SUMMARY


Technical  Problem 


rz^;^;:^:r,',tion''  con,r"u  wi'" ,hc  «— * *•*»  „» 

££Z£  ,he  ARMNE“"'k'"  ‘"Kl  rdi"l’IC  CO"fiSl"a,ions  •“  r«w,h  re- 

• To study the properties of packet switched computer communications networks.
• To develop techniques for the analysis and design of large scale networks.
• To apply recent computer advances, such as interactive display devices and distributed computing, to the analysis and design of large scale networks.

General  Methodology 

The approach to the solution of these problems has been the simultaneous:

• study of fundamental network analysis and design issues,

• development of efficient algorithms for large scale network analysis and design,

dispiay  » 

SS.U'  Z'Z**  m"  **"  «*  «*  cos,  and  performance 

Technical  Results 

In this report, the following major accomplishments are discussed:

las  completed' This" * 7**!  0riC"IC‘l  "CUv0,k  «*<  J"‘<  Performance 

TIP  rnd  ANTk  ll  T dlrCCtClt  ■“  dcw|0Pi'>8  techniques  for  „p,imiri„fi 

1 ip  and  ANTS  like  nodes  in  large  networks.  h 
The new large network design technique, based on "cut-saturation", reported in
Semiannual Report 2, was extended to networks with multiple capacity options.


A method for determining absolute lower bounds on cost for packet switched
networks, given traffic and delay requirements, was developed.


The second phase of an interactive network data handling system has been
completed for an IMLAC display editing system for large network graphics.


The second phase of a detailed, event oriented simulation model to develop flow
control and routing algorithms was completed for the packet radio system. This
system is now operating on the ARPANET with PDP-10 editing and 360/91 RJS
for extensive computations.


Major progress was made in the development of labeling and initialization
schemes for the packet radio system.


A model for the set covering/repeater location problem was developed and tested.

An extensive study of flow, delay and throughput in packet radio networks was
completed. 

Department  of  Defense  Implications 


The Department of Defense has vital need for highly reliable and economical
communications. The results described in this reporting period reinforce conclusions of
earlier periods about the validity of packet switching for massive DOD data communications
problems. A major portion of the cost of implementing this technology will occur in providing
local access to the networks. Hence, the development of local and regional communication
techniques must be given high priority.

Implications  for  Further  Research 

Further research must continue to develop tools for the study of large network problems.
These tools must be used to investigate tradeoffs between terminal and computer density,
traffic variations, the effects of improved local access schemes such as packet radio, the use
of domestic satellites in broadcast mode for backbone networks, and the effect of link and
computer hardware variations in reliability on overall network performance. The potential
of these networks to the DOD establishes a high priority for these studies.
A CUT SATURATION ALGORITHM FOR TOPOLOGICAL DESIGN OF
PACKET  SWITCHED  COMMUNICATION  NETWORKS 
PART  2 


I. INTRODUCTION


The cut-saturation (CS) technique for the topological design
of packet-switching networks was first presented in Semi-annual
Report #2 [1] and was shown to be computationally much more
effective than other available techniques. The algorithm described
in Ref. [1] applies to the design of networks in which all
communications circuits have the same preassigned line speed (e.g., all
50 kb/s line capacities).

This report extends the CS technique to topological network
design with multiple line capacity options.

The  topological  design  problem  here  addressed  can  be  very 
generally  formulated  as  follows: 


1.  Given  the  node  locations,  the  traffic  requirements 
between  such  locations  and  the  line  capacity  options 
available; 


2.  Minimize  the  total  line  and  modem  cost; 


3.  Such that the average delay and connectivity
requirements are satisfied.


The  use  of  multiple  capacity  options  can  provide  substantial 
cost  savings  in  several  situations.  In  particular,  it  is  desirable 
to  consider  topological  implementations  which  include  different 
line speeds in the following cases:
A. Non-Uniform Requirements


When  different  pairs  of  nodes  have  different  throughput, 
delay  and  bandwidth  requirements,  it  is  often  possible  to  satisfy 
such  requirements  in  a cost  effective  manner,  by  providing  higher 
capacity  connections  between  selected  node  pairs. 


B. Large Capacity Gaps

Even if node pair requirements are uniform, a non-uniform
channel capacity assignment can be desirable when there exist large
gaps between the available capacity options, or when line tariffs
have an irregular cost structure. In such cases, for a uniform
capacity implementation one has to choose between two capacity
alternatives, where the higher alternative is typically too
expensive, and the lower is not adequate to satisfy all of the
requirements, even if highly connected topologies are considered in
an attempt to improve network performance. A blend of the two
options generally provides the best solution.

C. Large Scale Economies

If the tariff structure offers a volume discount in the
cost per unit bandwidth of leased wide band channels, then substantial
savings can be obtained by using a few channels with speed
higher than average, to accommodate high volume requirements between
different network regions, and lower speed channels to satisfy
regional requirements. In the limit, the presence of a strong volume
discount and a large network size might justify a hierarchical
network implementation, with a different capacity selection for each
hierarchical level.




Network  Analysis  Corporation 


D.  Inadequate  Reliability 

In the design of communications networks it often happens
that topologies which are very satisfactory from the throughput and
delay point of view are nevertheless inadequate from the
reliability point of view. One well known technique for the
improvement of network reliability consists of increasing network
connectivity with the introduction of new links. In order to optimize
the cost-reliability trade off, links of lower capacity are often
introduced.

E. Network Growth

During the life of a data network it is likely that, at
some point in time, tariff changes or new communications offerings
will make it desirable to use channel types and speeds different
from the ones selected for the original design. Since it is often
impossible for practical reasons to reoptimize in one shot the
topological configuration using only the new, more advantageous
offerings, a partial reconfiguration is in general performed,
introducing the new services where most beneficial. A topological
design which accounts for multiple capacity options is therefore
required.


Although multiple capacity options can introduce substantial
cost savings, as mentioned in the above cases, they nevertheless
require more sophisticated routing and flow control procedures
than the single option case. For example, in the uniform capacity
case the routing program needs to consider only channel utilizations
(or, equivalently, channel queue lengths), while a non-uniform
capacity routing program must account for utilization, transmission
rate, propagation delay and error characteristics relative to the
specific capacity option implemented on the channel.
Another important aspect to be considered is file transfer
bandwidth, i.e., the maximum data rate at which a file can be
transferred through the network. An approximate evaluation of file
transfer bandwidth can be done by assuming that the data travels
along only one source to destination path, and that packets are
sent one after the other as in a "pipeline", without waiting for
RFNM's or buffer allocation confirmations, etc. Under such
assumptions, the uniform capacity network bandwidth is clearly equal to
the value of channel capacity, while in the non-uniform capacity
network implementation the bandwidth along a source to destination
path is equal to the minimum capacity in the path. Recalling that
the average source to destination single packet delay is related
to the average value of capacity along a path, while the maximum
bandwidth is related to the minimum value of capacity, it will be
observed that the file transfer performance is typically better
in a uniform capacity network than in a non-uniform one, assuming
that average delay performance is the same.
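
The pipeline approximation above can be illustrated with a minimal sketch. The function name and the two example paths are hypothetical and are not taken from the report; only the rule "bandwidth equals the minimum capacity in the path" comes from the text.

```python
def pipeline_bandwidth(path_capacities_kbps):
    """Approximate file transfer bandwidth along a single path.

    Under the pipeline assumption, the bandwidth is limited by the
    slowest link, i.e. the minimum capacity along the path.
    """
    if not path_capacities_kbps:
        raise ValueError("a path must contain at least one link")
    return min(path_capacities_kbps)


# Hypothetical three-hop paths (capacities in kb/s).
uniform_path = [19.2, 19.2, 19.2]
mixed_path = [50.0, 9.6, 50.0]

print(pipeline_bandwidth(uniform_path))  # 19.2
print(pipeline_bandwidth(mixed_path))    # 9.6
```

In this illustration the mixed path has the higher average capacity, yet its file transfer bandwidth is set by the single 9.6 kb/s link.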

All  the  above  issues  must  be  considered  in  order  to  decide 
whether  to  use  multiple  capacity  options,  or  how  many  options  to 
use, for each network design application. During the network
planning phase it is important, therefore, to have available an
efficient algorithm that generates low cost, multiple capacity network
topologies. Hence the practical importance of the algorithm
described in the sequel.
II. BACKGROUND


The CS algorithm for multiple capacity options is an
extension of the CS algorithm for fixed capacities. This section
summarizes the general concepts and properties of the CS approach;
a more detailed discussion is found in [1].

A cut is a set of lines whose removal will disconnect the
network. A cut is saturated if the traffic load in every line of
the cut equals line capacity. There are in general a large number
of cuts in a network. When traffic load grows, one of the cuts
approaches saturation. This cut is the bottleneck to any further
throughput increase; therefore, if a higher throughput is desired,
the capacity of the cut must be increased by upgrading some of the
line capacities in the cut and/or introducing new lines across
the cut.

Conversely, it is intuitively obvious, and has been
experimentally confirmed, that the reduction or elimination of low
utilized links within each of the partitions separated by the
saturated cut will in general result in only a marginal reduction
of network throughput if the network is at least 2-connected.

The CS algorithm is based on the above stated concepts. The
algorithm attempts to keep network throughput within specified
bounds, while iteratively reducing overall cost and maintaining
capacity, delay and reliability constraints.

The effectiveness of the algorithm is closely related to our
ability to solve the following three steps:


A.  Determination of the saturated cut set.


B.  Optimal increase of cut set capacity (obtained
with appropriate link insertion or upgrading).
C.  Reduction or removal of links not in the cut set.


The three above steps are extensively discussed in a later
section.
III. STATE-OF-THE-ART


The topological design problem for packet switching networks
has not been given in the past the same attention that was given to
other similar problems (e.g., the Telpak problem, centralized
network problems, etc.). This is in part because the packet-switching
technology is relatively new, and also because the design of a
packet-switching network requires as essential ingredients the
solution of complex queueing, routing and reliability problems.
Therefore, the literature is rather scarce on this subject.

Among the few contributions available we mention here the
Branch Exchange Method (BXC) [2] and the Concave Branch
Elimination (CBE) [3].

The BXC method starts from an initial feasible topology and
performs at each step a simple topological transformation (branch
exchange). If the cost-throughput trade off is improved, the
transformation is accepted; otherwise it is rejected. The procedure
stops after a large number of local transformations is
systematically explored. The BXC method does not have the capability of
identifying at each step the subset of transformations which are
most likely to produce performance improvements; therefore all
transformations must be systematically explored. BXC is a very
time consuming algorithm and its application is limited to small
or medium size networks.

The CBE method can be applied whenever the discrete costs
corresponding to the multiple capacity options can be reasonably
approximated by concave curves. The method consists of starting
from a fully connected topology, using concave costs and repeatedly
applying a minimum cost routing algorithm [3] until a local
minimum is reached. Typically, the algorithm eliminates uneconomical
links, and strongly reduces the topology. Once a locally minimal
topology is reached, the discrete capacity solution can be obtained
from the continuous solution with an appropriate selection of
capacity options. Since 2-connectivity is required, the algorithm
is terminated whenever the next link removal violates this
constraint; the last 2-connected solution is then assumed to be the
local minimum. In order to obtain several local minima, and
therefore several different topological solutions, the algorithm is
applied to several randomly chosen initial flows.

The CBE algorithm has a better knowledge of the structure of
the problem (network topology, cost-capacity volume discount, etc.)
than the BXC method and improves network performance at each
iteration. It is also less time consuming than BXC, since it explores
only a selected subset of solutions (namely the local minima).
However, it suffers the following limitations: it can only remove
links (i.e., no links are added during the optimization procedure),
and it requires smooth cost-capacity curves, without large gaps and
irregularities.

In summary, the presently available methods are either
computationally inefficient, or they are able to handle only very special
line tariff structures. There is therefore the need for algorithms
which are computationally efficient, and that can handle very general
line cost structures. The CS method described in the sequel has
been developed with the intent of meeting the two above objectives.
IV. THE CS METHOD FOR MULTIPLE CAPACITY OPTIONS


A.  General 


The  CS  method  for  multiple  capacity  options  is  based 
on  the  same  concepts  that  inspired  the  development  of  the  original 
CS  method  for  fixed  capacities.  Some  changes  and  some  new  features 
were  required  in  the  practical  implementation  of  the  algorithm, 
due  to  the  presence  of  multiple  capacity  levels.  The  major  changes 
are  relative  to: 

1.  The  determination  of  the  saturated  cut; 

2.  The  insertion  or  upgrading  of  links; 

3.  The  deletion  or  reduction  of  links;  and 

4.  The  determination  of  starting  topologies. 

These  issues  are  discussed  in  the  following  sections. 

B. Determination of the Saturated Cut

The saturated cut is found using the following procedure.
Network throughput is first increased, using a very efficient
routing algorithm, until the delay constraint is satisfied with
equality. The output of the program consists of the optimal
values of cost, throughput and delay, and the value of the optimal
flows in all links. Next, links are ordered according to their
utilization, which is defined as the ratio of flow to capacity,
and are removed starting from the most utilized one, until the
network becomes disconnected. The minimum disconnecting set of
links is defined as the saturated cut.
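
The removal procedure just described can be sketched as follows. This is a minimal illustration, not the report's program: the function names, the data layout (dictionaries keyed by link) and the four-node example are hypothetical, and the "minimum disconnecting set" is interpreted here, as an assumption, as the removed links whose endpoints fall in different components once the network first becomes disconnected. Chain collapsing is not modeled.

```python
from collections import defaultdict


def components(nodes, links):
    """Label each node with a representative of its connected component."""
    adjacency = defaultdict(set)
    for u, v in links:
        adjacency[u].add(v)
        adjacency[v].add(u)
    label = {}
    for start in nodes:
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:  # depth-first search from `start`
            u = stack.pop()
            for w in adjacency[u]:
                if w not in label:
                    label[w] = start
                    stack.append(w)
    return label


def saturated_cut(nodes, flow, capacity):
    """Remove links in order of decreasing utilization until disconnection.

    flow and capacity are dicts keyed by link (u, v). Returns the removed
    links that actually separate the two resulting components.
    """
    ranked = sorted(flow, key=lambda e: flow[e] / capacity[e], reverse=True)
    remaining = set(ranked)
    removed = []
    for link in ranked:
        remaining.discard(link)
        removed.append(link)
        label = components(nodes, remaining)
        if len(set(label.values())) > 1:  # network is now disconnected
            return [(u, v) for (u, v) in removed if label[u] != label[v]]
    return removed


# Hypothetical example: two regions {A, B} and {C, D} joined by two busy links.
nodes = ["A", "B", "C", "D"]
capacity = {("A", "B"): 50.0, ("C", "D"): 50.0, ("A", "C"): 50.0, ("B", "D"): 50.0}
flow = {("A", "B"): 15.0, ("C", "D"): 10.0, ("A", "C"): 45.0, ("B", "D"): 40.0}
print(saturated_cut(nodes, flow, capacity))  # [('A', 'C'), ('B', 'D')]
```

Filtering the removed links by component membership discards any highly utilized link that was removed along the way but lies entirely inside one of the two partitions.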
It was already observed in [1] that, for an appropriate
determination of the saturated cut, the chains of nodes that carry
prevalently transit traffic must be "collapsed" into a single
equivalent link, before the above described removal operation is
performed.

The criterion presented in [1] to determine whether a
chain is collapsible or not is rather heuristic and not sufficiently
precise for the multiple capacity case, where the links in the same
chain can have different capacity and different utilization values.
Therefore, the following more accurate criterion was developed.


Let us assume that the traffic requirement matrix R = [r_{ij}]
is symmetric and therefore in each link the flow is the same in both
directions; let f_1 and f_n be the terminal flows in the chain shown
in Figure 1.


FIGURE 1. COLLAPSIBLE CHAIN


Let

    Q = \sum_{i \in I_c} \sum_{j \notin I_c} r_{ij}

(where I_c is the set of nodes in the chain) be the traffic originating
from the nodes internal to the chain and transmitted to the external
nodes (internal traffic).

Letting S be the one way transit traffic in the chain, it is
easily seen that:

    S = f_1 + f_n - Q


The chain is collapsed when the transit traffic S is predominant
with respect to the internal traffic. More specifically, the
chain is collapsed if

    \max_{i=1,\ldots,n} \frac{f_i - S}{f_i} \le \alpha,

where \alpha is an input parameter experimentally adjusted.

Although the above criterion is not completely fail-safe
(e.g., it is possible to construct pathological examples in which
chains with predominantly internal traffic are declared collapsible
by the test), the criterion has been found adequate for most network
design applications.


C. Link Insertion or Upgrading


In order to select the link to be inserted or upgraded,
the following ratio is computed for each existing or potential
link connecting any two nodes separated by the cut set:

    \rho_i = \frac{D'_i - D_i}{C'_i - C_i}

where C_i and D_i are the capacity and cost of the current option and
C'_i and D'_i are the values corresponding to the next available option.
If the link is new, D_i = C_i = 0; if the link cannot be expanded any
further, \rho_i = \infty. Special attention, and slightly more elaborate
expressions, are used in the case of chain upgrading.

After having computed all ratios \rho_i, the link which
minimizes the ratio (i.e., provides the lowest incremental cost per
unit of bandwidth) is introduced or upgraded to option C'_i.
The minimization of the above ratio represents the basic
criterion for link insertion or upgrading. Such a criterion can be
improved and extended to account for the effect of a new link
introduction on reliability (e.g., a link which eliminates pendant
nodes or long chains is a favorite candidate), and to weigh the
capacity increase across the cut set with the amount of
capacity actually required to achieve the target throughput REMAX.
These and several other versions of the upgrading criterion are
presently being tested.
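
The basic insertion or upgrading criterion (lowest incremental cost per unit of added capacity) can be sketched as below. The function names, the candidate data layout and all numerical values are hypothetical; only the ratio itself, the convention that a new link starts from zero capacity and cost, and the infinite ratio for a link that cannot be expanded are taken from the text. The refinements for chains and reliability are not modeled.

```python
import math


def upgrade_ratio(current, next_option):
    """Incremental cost per unit of added capacity for one candidate link.

    current and next_option are (capacity, cost) pairs. A new link has
    current == (0.0, 0.0). next_option is None when the link is already
    at its highest option, in which case the ratio is infinite.
    """
    if next_option is None:
        return math.inf
    capacity, cost = current
    next_capacity, next_cost = next_option
    return (next_cost - cost) / (next_capacity - capacity)


def select_upgrade(candidates):
    """Return the candidate link with the smallest ratio."""
    return min(candidates, key=lambda link: upgrade_ratio(*candidates[link]))


# Hypothetical candidates across a cut: link -> (current option, next option).
candidates = {
    ("A", "C"): ((9.6, 900.0), (19.2, 1500.0)),   # upgrade an existing link
    ("B", "D"): ((0.0, 0.0), (9.6, 700.0)),       # insert a new link
    ("A", "D"): ((50.0, 3000.0), None),           # cannot be expanded further
}
print(select_upgrade(candidates))  # ('A', 'C')
```

In this made-up instance, upgrading the existing link costs about 62.5 per kb/s of added capacity, against about 72.9 for the new link, so the upgrade is preferred.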


D. Link Reduction and Removal

The criterion for link reduction (or removal) is based
on the maximization of the following ratio over the existing links:

1-1 
where:

    C'_i and D'_i : capacity and cost of the lower option for link i.
    \Delta C_i = C_i - C'_i : capacity reduction on link i.
    \Delta D_i = D_i - D'_i : cost saving corresponding to the capacity reduction for link i.
    f_i : current flow in link i.
    \Delta f_i = \max(f_i - C'_i, 0)
    \alpha and \beta : input controlled parameters varying between 0 and 1.
Ratio \theta_i is calculated for all existing links except for
the links in the cut set, and the link which maximizes the ratio
is reduced or removed. Parameters \alpha and \beta are experimentally
adjusted and vary between 0 and 1. If \alpha = 1 and \beta = 0, then
\theta_i = \Delta D_i / \Delta f_i
and the ratio represents the cost saving per unit of flow that must
be rerouted on other links, after the reduction of link i. By
letting \alpha = \beta = 1, one equally penalizes the loss of capacity and the
need to reroute flow on link i. The proper selection of \alpha and \beta
depends on problem characteristics (cost structure, delay
requirements, etc.) and can be done only experimentally.

In the case of link removal, the additional criterion
that the network must remain 2-connected after the removal is
applied.


E. Starting Topologies


The determination of good starting topologies to
initialize the CS method is more critical in the case of multiple
options than in the single option case. In fact, the multiple
option algorithm is "more local" than the single one, in the sense
that it tends to upgrade or reduce (rather than introduce or remove)
links and therefore it does not produce such drastic topological
changes as the fixed capacity option algorithm. Clearly, the locality
can be corrected by properly adjusting the criteria for link
insertion and removal; however, some degree of locality will always
remain. Therefore, it is important to start from potentially
good topologies.

An effective selection of starting topologies is offered
by single option CS method solutions providing throughput and
delay performance close to the designed requirements. The multiple
option CS method applied to such topologies can thus be
interpreted as a refinement procedure on an initial fixed capacity
solution.
F. The Algorithm

The CS algorithm generates low cost, 2-connected topologies
with throughput performance ranging between two specified
throughput levels REMIN and REMAX. An iteration of the algorithm consists
of the following fundamental steps:

1.  Solution of an optimal routing problem for the
current topology and capacity assignment, and
determination of the saturated cut.

2.  Topology and/or capacity modification, in order
to improve network cost-effectiveness and to drive the
solutions to the desired throughput range. In
particular the algorithm performs one of the following
operations:


a.  Increase only (i.e., links are upgraded or
added) if the current throughput level RE < REMIN;

b.  Reduce only (i.e., links are reduced or deleted)
if RE > REMAX;

c.  Perturbation (i.e., one link reduction (or
deletion) and one link upgrading (or insertion) are
performed simultaneously) if REMIN < RE < REMAX.

3.  Acceptance test. If the new solution is dominated
by previous solutions, or it is not cost-effective, or
it is a repeat solution, the algorithm goes back to step
2 and selects an alternative modification. Otherwise
it returns to step 1.

The  algorithm  is  initialized  by  providing  an  initial 
starting  topology,  and  produces  a new  network  configuration  at 
each  iteration. 
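
The choice among the three modifications in step 2, and one possible form of the dominance check in step 3, can be sketched as follows. Both functions are hypothetical illustrations. The text does not say what happens when RE equals REMIN or REMAX exactly, so the sketch assumes those boundary cases are treated as perturbation. The text also does not define "dominated"; the sketch assumes a solution is dominated when an earlier one has no higher cost and no lower throughput.

```python
def choose_operation(throughput, remin, remax):
    """Select the step 2 modification from the current throughput RE."""
    if throughput < remin:
        return "increase"   # upgrade or add links
    if throughput > remax:
        return "reduce"     # reduce or delete links
    return "perturb"        # one reduction and one upgrade together


def is_dominated(candidate, previous):
    """Assumed dominance test on (cost, throughput) pairs.

    A candidate is dominated if some different earlier solution costs
    no more and carries at least as much throughput.
    """
    cost, throughput = candidate
    return any(
        c <= cost and t >= throughput and (c, t) != (cost, throughput)
        for c, t in previous
    )


# Throughput range used in the application section: 100 to 200 kb/s.
print(choose_operation(80.0, 100.0, 200.0))   # increase
print(choose_operation(150.0, 100.0, 200.0))  # perturb
print(choose_operation(230.0, 100.0, 200.0))  # reduce
```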
V. AN APPLICATION


As an application of the multiple capacity CS algorithm we
consider the design of a 26 node ARPANET-like topology using
3 capacity options (9.6, 19.2 and 50 kb/s), with throughput
requirement ranging between REMIN = 100 kb/s and REMAX = 200 kb/s
and with delay requirement Tmax < .250 sec. Traffic demands are
symmetric and uniform over all node pairs.

The  following  line  and  modem  costs  are  assumed  for  the  above 
mentioned  capacity  options: 


Cap (kb/s)                 Fixed Cost ($/mo)    Line Cost ($/mile x mo)

9.6                        493                  .42
19.2                       850                  2.50
19.2 (using biplexers)     1500                 .84
50                         850                  5.00

Biplexers are used to implement 19.2 kb/s circuits from two
9.6 kb/s circuits in parallel, whenever this alternative is
more economical than the standard 19.2 AT&T offering.
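
The tariff table can be turned into a small cost calculation that shows when the biplexer alternative is the cheaper way to obtain 19.2 kb/s. This is a sketch under stated assumptions: the option labels and function names are hypothetical, and the monthly cost of a circuit is assumed to be the fixed cost plus the per-mile line cost times the circuit length. The table does not say whether the fixed cost is charged per circuit or per circuit end, so the per-circuit reading used here is an assumption.

```python
# (fixed cost in $/mo, line cost in $/mile per month), taken from the table.
TARIFFS = {
    "9.6": (493.0, 0.42),
    "19.2": (850.0, 2.50),
    "19.2-biplexer": (1500.0, 0.84),
    "50": (850.0, 5.00),
}


def monthly_cost(option, miles):
    """Assumed cost model: fixed cost plus mileage-dependent line cost."""
    fixed, per_mile = TARIFFS[option]
    return fixed + per_mile * miles


def cheapest_19_2(miles):
    """Pick the cheaper of the two 19.2 kb/s implementations."""
    return min(("19.2", "19.2-biplexer"), key=lambda o: monthly_cost(o, miles))


# Length at which both 19.2 kb/s implementations cost the same.
breakeven_miles = (1500.0 - 850.0) / (2.50 - 0.84)

print(cheapest_19_2(100.0))    # 19.2
print(cheapest_19_2(1000.0))   # 19.2-biplexer
print(round(breakeven_miles))  # 392
```

Under this cost model the standard 19.2 kb/s offering is cheaper on short circuits, and the biplexer alternative becomes cheaper beyond roughly 392 miles.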

Several  starting  solutions  were  considered.  Among  them  we 
mention  an  all  50  kb/s  configuration  shown  in  Figure  2;  and  the 
same  topology  as  in  Figure  2 but  with  all  9.6  kb/s  links.  The 
50 kb/s solution is not very cost-effective considering our
throughput requirements. The 9.6 kb/s solution is, on the other hand,
inadequate for the .250 sec delay requirement (see Figure 3).

Starting  from  the  above  configurations  several  solutions  were 
generated.  The  most  representative  ones  are  shown  in  the 
throughput  vs.  cost  curve  plotted  in  Figure  3.  A typical  solution 
is  shown  in  Figure  4:  notice  that  most  of  the  short  connections 
are implemented with 50 kb/s circuits, while the medium and long
links  use  9.6  and  19.2  kb/s  circuits.  This  behavior  is  due  to 
the  particular  cost  structure  considered.  Figure  5 and  Figure  6 
show  two  CS  solutions  which  have  approximately  the  same  cost  and 
throughput  performance  but  have  different  connectivity.  The  first 
solution  is  more  attractive  from  the  high  bandwidth  point  of  view 
since  it  provides  a 50  kb/s  route  across  country;  the  second  is 
more  attractive  for  reliability. 

Notice that most CS solutions are regularly aligned along a
curve which represents the cost-throughput trend for the 26 node
network using the given capacity options. The same trend is shown
by the lower bound cost solutions obtained by solving the
topological problem using concave instead of discrete costs (see
Figure 7) and relaxing the 2-connectivity constraint. The
solid curve in Figure 3 connects such lower bound solutions.

Finally, the Concave Branch Elimination (CBE) method described
in Section III was applied to the problem and two solutions are shown
in Figure 3. Notice that the CBE solutions are well in line with
the CS solutions, although their topology typically shows a higher
degree of connectivity. This fact allows us to conjecture that
both CS and CBE solutions are near optimal.
FIGURE 3 (plot not reproduced): throughput (kb/s) vs. cost. Legend: initial 50 kb/s configuration, CS solutions, CBE solutions.

FIGURE 5: TOPOLOGY OBTAINED WITH THE CS METHOD


The  CS  algorithm  is  an  important  contribution  to  the  topological 
design  of  packet-switched  networks,  in  that  it  is  computationally 
very  efficient  and  applicable  to  very  general  cost  structures. 
However, more research is required in this and other areas of
network design. In the sequel we report some directions for future
investigation.

Present topological design techniques are based on static
performance criteria (e.g., average flow and delay) and do not take
into adequate consideration the impact of topology and channel
selections on some of the important aspects of network operation
(e.g., adaptive routing, flow control, buffer overflow, traffic
congestion, etc.). It is important therefore to further investigate
the relationship between topological characteristics (such as
connectivity, use of different line capacities, etc.) and dynamic
performance criteria (e.g., network controllability, file transfer
bandwidth, etc.) and to include dynamic performance considerations
in the design phase by using appropriate constraints and design
criteria.

Since  the  CS  algorithm  is  based  on  a variety  of  heuristic  steps, 
it  is  clearly  possible  to  improve  the  algorithm  by  refining  some  of 
the  steps.  Therefore,  research  should  be  pursued  on  issues  such  as: 
development  of  efficient  procedures  for  multiple  (rather  than  single) 
insertion,  deletion,  upgrading  and  reduction  of  links  at  each  CS 
iteration;  development  of  more  efficient  techniques  to  predict  the 
cost-throughput  effectiveness  of  each  topological  transformation, 
and  thus  select  the  most  effective  transformation;  inclusion  of 
more  precise  reliability  concepts  in  the  design  criteria,  etc. 

While experimenting with the CS algorithm with multiple capacity
options we noticed that, although the algorithm is very general and
can deal with any type of line tariffs, it gave
better results with some tariff structures than others. This fact
suggests that it is possible to improve the performance of the
algorithm by adjusting the heuristics to the particular cost
structure of the problem. Clearly, it is not desirable to maintain
several design programs, one for each cost structure. The
designer should identify the steps that are most sensitive to
tariff changes, and should implement the computer program in a
modularized fashion, so that only a few modules will be modified
when tariffs change [4].

Another  important  direction  for  future  research  is  the  design 
of  large  networks,  with  hundreds  or  thousands  of  nodes.  Such 
networks  cannot  be  directly  designed  using  traditional  algorithms, 
because  of  the  prohibitive  computer  time  and  memory  requirement. 

The typical approach consists of partitioning the network into
hierarchical  levels  and  applying  traditional  design  techniques 
within  each  partition.  Since  a large  network  will  contain  several 
partitions,  each  requiring  a separate  topological  design,  there 
is  the  need  for  very  efficient  design  algorithms.  Therefore,  the 
CS  algorithm  can  be  considered  an  important  contribution  to  large 
net  design.  The  next  important  steps  are:  the  development  of  good 
criteria  for  node  partitioning;  the  implementation  of  an  efficient 
data  base  to  store  the  input  parameters  and  intermediate  design 
results; and the evaluation of global network performance.
Preliminary results in some of the above areas are already available
and are reported in the First and Second semi-annual NAC reports
([1] and [5]), but more research needs to be done.

Since  most  design  algorithms  are  based  on  heuristics,  it  is 
conceivable  that  such  algorithms  can  be  greatly  enhanced  with  the 
aid  of  interactive  graphics.  The  implementation  of  analysis  and 
design  programs  supported  by  interactive  graphics  can  be  useful 
in two ways:

1. It can assist in the development of better heuristics,
since it allows immediate appraisal of the effect of changes
in parameter values and design criteria on algorithm
performance;

2. It offers an experienced designer the possibility
of monitoring the network design process iteration by
iteration, and of correcting any inefficiencies
of the algorithm.


Considerable research effort has been devoted at NAC to
the development of interactive graphics software for network
analysis and design applications. The results are described in
other sections of this report and clearly indicate that the
development of interactive design programs supported by a graphic
terminal is another very important direction for future research in
the area of network design.
TERMINAL ORIENTED NETWORK COST AND PERFORMANCE - PART 3
THE  ACCESS  FACILITY  LOCATION  PROBLEM 


INTRODUCTION 


In  any  network  with  a large  number  of  widely  dispersed  "users" 
accessing  a limited  number  of  "resources",  the  strategy  for  access 
will  play  a large  part  in  determining  the  cost  and  performance  of 
the  network.  The  "users"  may  include  not  only  time-sharing  terminals, 
but  also  terminals  used  for  message  transfers,  remote  automatic  sensing 
devices (such as might be found in an environment monitoring
situation), manned sensing stations, and several others. The "resources"
may be as sophisticated as many heterogeneous computers tied
together in a packet switching high level subnet, such as in the
ARPANET,  or  as  simple  as  a single  computer  processing  data  received 
from  automatic  remote  sensing  devices.  An  almost  endless  number 
of  "user"  and  "resource"  combinations,  both  covering  and  extending 
the  range  described  above,  appear  possible.  Effective,  economical 
"user"  access  will  depend  on  both  the  development  of  hardware  to 
facilitate  access,  and  the  development  of  network  and  topology 
design  techniques  to  effectively  utilize  such  hardware.  This 
chapter  continues  the  study  of  this  problem. 

There  are  several  ways  to  provide  access,  including  land  lines 
(i.e.  ordinary  telecommunication  channels  as  derived  from  cable  or 
LOS  microwave  transmission  systems)  and  packet  radio  techniques. 

In this chapter, advances in the design techniques for land line
approaches are reported; advances in packet radio techniques are
reported in other chapters. Cost effective land line access will
depend on an effective line layout algorithm to connect "users" with
access facilities, development of hardware to serve as access
facilities, and an effective facility location algorithm to determine
both the number and location of access facilities. The line
layout  problem  for  a given  set  of  access  facility  locations  has 
been  effectively  dealt  with  in  Part  1 in  Semiannual  Report  #1. 
Advances  in  hardware  serving  as  access  facilities  have  been  discussed 
in  Part  2 in  Semiannual  Report  #2.  In  this  part  the  development  of 
an  effective  facility  location  algorithm  is  reported. 

The  facility  location  problem  can  be  formulated  in  many  ways, 
and  can  appear  at  several  different  levels  within  the  same  network 
design  problem.  For  a network  such  as  the  ARPANET,  the  resources 
(hosts)  are  distributed  over  a large  geographical  region,  and  the 
users  (terminals)  may  connect  to  them  directly,  or  access  them 
through  the  packet  switching  subnet  by  being  connected  to  an  access 
facility (such as a TIP or ANTS), as shown in Figure 1. If the
resource locations and user locations are fixed, as shown in Figure
2, the problem becomes one of locating the access facilities (TIP
or ANTS) to minimize the cost, subject to capacity, performance,
and reliability constraints. When the locations of these facilities
(TIP or ANTS) are fixed, the problem may appear at the level of using
concentrators or multiplexers as "access facilities" for connecting
the users to these new "resources" (i.e. the fixed TIP or ANTS
locations). Because the facility location problem can be posed for
various levels of the same design problem, with devices performing
different "functions" depending on context, we will pose the problem
in terms of a generic access facility, called a TACOM (TIP, ANTS,
concentrator or multiplexer), and a generic resource to which users
and/or TACOMs are to be connected, called a RESCOP (resource
connection point). We will also use the term "node" as a generic
replacement  for  "user".  Thus,  the  general  problem  is  one  of  locating 
TACOMs  to  most  economically  allow  connection  of  a given  set  of 
nodes  to  a given  set  of  RESCOPs. 

The  most  basic  formulation  of  this  problem  is  when  there  is 
only  one  RESCOP  location,  and  the  nodes  may  be  connected  directly 
to  the  RESCOP,  or  through  TACOMs,  which  are  in  turn  connected 
directly to the RESCOP. The recent advances reported here are,
for simplicity, presented in the context of this basic formulation.
After  presentation  of  the  basic  algorithm  and  discussion  of  its 
performance  characteristics,  we  consider  its  application  to  the  more 
general  problem  with  multiple  RESCOP  locations. 

In  Section  II  a simple  description  of  the  problem  is  developed. 
In  Section  III,  related  problems  and  solution-techniques  are 
described,  and  in  Section  IV  previous  approaches  to  this  problem 
are considered. In Section V, a simplified description of our
approach  is  given,  and  in  Section  VI  a detailed  description  of  the 
algorithm  is  presented.  Results  of  experiments  to  determine  the 
performance  of  the  algorithm  are  given  in  Section  VII,  and  extensions 
to  more  general  problems  are  given  in  Section  VIII. 
II. BASIC PROBLEM STATEMENT

The basic problem may be posed as that of a geographically
distributed set of nodes that must be connected to a RESCOP (resource
connection point). The most primitive topology for such a
network is shown in Figure 3(a). Each node is connected directly
to the RESCOP. There is no topological design effort associated with
this network. Its cost is easily expressed as

$$\mathrm{COST}_{\mathrm{STAR}} = \sum_{i=2}^{N} d_1(i)$$

where $d_i(j)$ is the cost of joining node j to node i, and the RESCOP
has been made node 1.
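
As a minimal sketch of the star cost, the Python fragment below sums the direct connection costs d_1(i) over the nodes. The node coordinates and the use of Euclidean distance as the connection cost are assumptions made only for illustration; no particular tariff is implied.

```python
from math import hypot

def star_cost(d1):
    """Sum of the direct node-to-RESCOP costs d_1(i), i = 2..N."""
    return sum(d1.values())

# Hypothetical example: RESCOP (node 1) at the origin,
# connection cost taken to be Euclidean distance.
coords = {2: (3.0, 4.0), 3: (6.0, 8.0), 4: (0.0, 5.0)}
d1 = {i: hypot(x, y) for i, (x, y) in coords.items()}

print(star_cost(d1))
```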

A more sophisticated design alternative is shown in Figure 3(b).
In this case, TACOMs (TIP, ANTS, concentrators, or multiplexers)
have been used to save cost. Here the topological design effort is
directed at selection of the number and location of TACOMs, and
assignment of nodes to the TACOMs, so as to minimize cost. The
TACOMs have a cost, D, and a cost of being connected to the RESCOP
when located at node i, $d_1^c(i)$, which may differ from the
node connection cost (due to a possible difference in bandwidth
requirements). Consider TACOMs at the two nodes k and l, and let
$B_k$ and $B_l$ be the sets of nodes which are assigned to k and l,
respectively, with A the set of all nodes. The cost of the star-TACOM
network may then be expressed as

$$\mathrm{COST}_{\mathrm{STAR\text{-}TACOM}} = \Big[ \sum_{i \in B_k} d_k(i) + D + d_1^c(k) \Big] + \Big[ \sum_{i \in B_l} d_l(i) + D + d_1^c(l) \Big] + \sum_{i \in (A - B_k - B_l)} d_1(i)$$
Let C be the set of TACOM sites, with $C \subset A$. The design problem may
then be expressed as:

Select C and $B_j$, $j \in C$, so as to minimize

$$\mathrm{COST}_{\mathrm{STAR\text{-}TACOM}} = \sum_{j \in C} \Big[ \sum_{i \in B_j} d_j(i) + D + d_1^c(j) \Big] + \sum_{i \in (A - \bigcup_{j \in C} B_j)} d_1(i)$$
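
The star-TACOM cost can be evaluated directly for any candidate choice of sites and assignments. The sketch below is one way to do so; the function name `star_tacom_cost`, the one-dimensional positions in `pos`, and the choice of making the TACOM-to-RESCOP cost equal to the ordinary connection cost are all hypothetical and serve only to make the example concrete.

```python
def star_tacom_cost(nodes, assign, d, d1c, D):
    """Cost of a star-TACOM design.

    nodes  : node ids 2..N (the RESCOP is node 1)
    assign : dict mapping each TACOM site j to the set B_j it serves
    d      : d(i, j) = cost of joining node j to node i
    d1c    : d1c(j) = cost of connecting a TACOM at j to the RESCOP
    D      : cost of one TACOM
    """
    total = 0
    served = set()
    for j, members in assign.items():
        total += sum(d(j, i) for i in members) + D + d1c(j)
        served |= set(members)
    # Nodes not assigned to any TACOM connect directly to the RESCOP.
    total += sum(d(1, i) for i in nodes if i not in served)
    return total

# Hypothetical one-dimensional layout; cost = distance along the line.
pos = {1: 0, 2: 10, 3: 11, 4: 12, 5: 1}
nodes = [2, 3, 4, 5]
d = lambda site, node: abs(pos[site] - pos[node])
d1c = lambda site: abs(pos[site] - pos[1])

print(star_tacom_cost(nodes, {}, d, d1c, D=3))              # pure star
print(star_tacom_cost(nodes, {3: {2, 3, 4}}, d, d1c, D=3))  # TACOM at node 3
```

In this made-up layout the single TACOM halves the cost because three distant nodes share one long link to the RESCOP.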

When nodes do not have to be connected directly to the host or
a TACOM, but rather can share a line (such as in a polling
situation), considerable cost savings can result from proper line layout,
as  shown  in  Figure  3(c).  If  there  are  no  constraints  on  the  number 
of  nodes  which  can  share  a line,  then  the  optimal  topology  is  simply 
a minimum  spanning  tree.  When  constraints  are  present,  the  design 
problem  becomes  more  interesting.  This  problem  was  effectively 
formulated  and  dealt  with  in  Part  1 in  Semiannual  Report  #1. 
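
For the unconstrained case, the minimum spanning tree can be built with Prim's algorithm. The sketch below is a generic textbook version, not the line layout program of Semiannual Report #1; the positions and the distance-based cost are assumed for illustration.

```python
def minimum_spanning_tree(points, cost):
    """Prim's algorithm. Returns (parent, child) edges rooted at points[0]."""
    root = points[0]
    # best[p] = (cheapest known link cost into the tree, tree endpoint)
    best = {p: (cost(root, p), root) for p in points[1:]}
    edges = []
    while best:
        child = min(best, key=lambda p: best[p][0])
        _, parent = best.pop(child)
        edges.append((parent, child))
        for p in best:
            c = cost(child, p)
            if c < best[p][0]:
                best[p] = (c, child)
    return edges

# Hypothetical layout: node 1 is the central site.
pos = {1: 0, 2: 1, 3: 2, 4: 10}
cost = lambda a, b: abs(pos[a] - pos[b])

edges = minimum_spanning_tree([1, 2, 3, 4], cost)
tree_cost = sum(cost(a, b) for a, b in edges)
star = sum(cost(1, p) for p in [2, 3, 4])
print(edges, tree_cost, star)
```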

A third, more general, and often most effective, design
alternative is the combined use of multidrop lines and TACOMs, as shown
in Figure 3(d). In this case, the problem is one of total network
design, including both selection of TACOM locations and line layout.
The other design alternatives may be considered as subsets of this
alternative.

It is this general problem that is considered here. There
are  many  possible  formulations  of  this  problem,  depending  on  the 
selected  constraints  and  cost  functions.  We  present  our  basic 
results  in  terms  of  the  simplest  of  these  formulations,  and  then 
in  a later  section  consider  the  many  possible  extensions.  This 
formulation  is  given  below: 
Given:

    A            - set of nodes i = 2, ..., N
    node 1       - RESCOP site
    H            - set of possible TACOM sites
    $w_{\max}$   - line capacity
    $c_{\max}$   - TACOM capacity
    $d_i(j)$     - cost of connecting node j to node i
    charge       - cost of TACOM

FIND: Low cost network design subject to the constraints:

    A) No line may have more than $w_{\max}$ nodes.
    B) No TACOM may serve more than $c_{\max}$ nodes.

In  this  formulation  the  set  of  possible  TACOM  sites,  H,  is  defined 
separately  from  the  set  of  nodes,  A.  In  most  practical  problems  the 
possible  TACOM  sites  are  limited  to  a subset  of  the  nodes  selected 
on  considerations  of  maintenance,  rental  space,  access  by  trained 
company  personnel,  security,  etc.  However,  it  is  quite  feasible 
for  situations  to  occur  where  the  possible  TACOM  sites  are  in  fact 
disjoint from the nodes (such as in a commercial time sharing
operation where TACOMs must be at company provided locations but all
terminals are at customer locations), partially overlap with the
set  of  nodes,  are  a proper  subset  of  the  nodes,  are  the  same  as  the 
nodes,  or  have  the  nodes  as  a proper  subset.  For  simplicity,  we 
have  chosen  to  present  the  algorithm  for  the  case  where  the  nodes 
are  the  possible  TACOM  sites,  thus  dealing  with  only  one  set,  A.  In 
a later  section  the  algorithm  will  be  also  shown  to  easily  handle 
the  other  cases  noted  above. 
III. RELATED PROBLEMS AND SOLUTION TECHNIQUES

There  are  many  problems  that  are  related  to  the  TACOM  location 
problem,  including  warehouse  location  problems,  clustering  problems, 
and  partitioning  problems.  In  this  section  we  discuss  these 
problems,  their  solution  techniques,  and  the  applicability  of 
these  techniques  to  the  TACOM  location  problem. 

A.  Warehouse  Location  Problems 


The  warehouse  location  problem  may  be  briefly  defined 
as  the  determination  of  the  number,  location,  and  capacity  of 
source  sites  in  order  to  minimize  the  cost  of  satisfying  a set 
of shipping requirements under a given cost matrix [6]. Efroymson
and Ray present a Branch-Bound algorithm solution technique for
this problem when it is formulated in a manner analogous to the
TACOM location problem with a point-to-point connection constraint
[8]. Relative to the general TACOM location problem, this approach
has  two  drawbacks:  it  does  not  easily  handle  the  multidrop  line 
case,  and,  for  large  problems,  the  computational  requirements 
are  prohibitive. 

The  computational  requirements  for  obtaining  exact 
solutions  to  these  problems  have  given  rise  to  many  heuristic 
approaches. Among the more successful is the "Add" algorithm
[21]. In this approach the starting point is a star network with all
"customers" directly connected to a central warehouse. Each
possible  warehouse  location  is  then  evaluated  by  determining 
the  cost  reduction  which  would  be  achieved  by  placing  a warehouse 
at  the  location.  The  location  giving  the  greatest  reduction  is 
then  selected  for  the  first  placement  of  a warehouse.  With  the 
new  warehouse  in  place,  the  process  is  repeated  for  the  next 
location. When no further cost reduction is achieved, the process
halts. Relative to the TACOM location problem, this approach also
has the drawback of not handling the multidrop line case efficiently;
evaluating each site with a complete line layout would be too
computationally costly. In addition, each time a site is selected,
it becomes permanent, but there is little reason to think that the
location of the best single TACOM is also one of the best locations
for two TACOMs, etc.
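
The "Add" procedure just described can be sketched for the point-to-point case as follows. This is an illustrative reading of the heuristic, not the program of [21]: the cost model (each opened site pays a fixed cost plus its own link to the central site, and each customer uses its cheapest available connection) and all names are assumptions.

```python
def total_cost(customers, open_sites, cost, center, fixed_cost):
    """Point-to-point cost: each customer uses its cheapest connection."""
    links = sum(min(cost(h, c) for h in [center, *open_sites])
                for c in customers)
    sites = sum(fixed_cost + cost(center, s) for s in open_sites)
    return links + sites

def add_algorithm(customers, candidates, cost, center, fixed_cost):
    """Greedy 'Add': open the site with the largest saving until none helps."""
    open_sites = []
    current = total_cost(customers, open_sites, cost, center, fixed_cost)
    while True:
        trials = {s: total_cost(customers, open_sites + [s], cost,
                                center, fixed_cost)
                  for s in candidates if s not in open_sites}
        if not trials:
            break
        best = min(trials, key=trials.get)
        if trials[best] >= current:   # no further cost reduction: halt
            break
        open_sites.append(best)       # the selection is permanent
        current = trials[best]
    return open_sites, current

# Hypothetical one-dimensional example: sites are named by their position.
customers = [10, 11, 12, 1]
cost = lambda a, b: abs(a - b)
sites, final = add_algorithm(customers, customers, cost, center=0, fixed_cost=3)
print(sites, final)
```

Note that every trial here is a cheap point-to-point evaluation; replacing it with a full multidrop line layout is what makes the approach costly for the general problem.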

The  "Drop"  algorithm  [10]  is  basically  the  reverse  of 
the  "Add"  algorithm.  Warehouses  are  initially  assumed  to  be  at 
all  possible  locations,  and  "customers"  are  assigned  to  multiple 
warehouses.  The  procedure  is  to  then  eliminate  the  warehouse  whose 
elimination  most  reduces  the  cost.  The  process  is  repeated  until 
no  further  reduction  is  possible.  The  difficulties  here  are 
directly  analogous  to  those  of  the  "Add"  algorithm. 
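
A matching sketch of the "Drop" procedure, under the same assumed point-to-point cost model as one might use for "Add" (fixed cost per open site plus its link to the central site; customers use their cheapest connection). The names and the example data are hypothetical.

```python
def total_cost(customers, open_sites, cost, center, fixed_cost):
    """Point-to-point cost: each customer uses its cheapest connection."""
    links = sum(min(cost(h, c) for h in [center, *open_sites])
                for c in customers)
    sites = sum(fixed_cost + cost(center, s) for s in open_sites)
    return links + sites

def drop_algorithm(customers, candidates, cost, center, fixed_cost):
    """Greedy 'Drop': start with every site open, close the most wasteful."""
    open_sites = list(candidates)
    current = total_cost(customers, open_sites, cost, center, fixed_cost)
    while open_sites:
        trials = {s: total_cost(customers,
                                [t for t in open_sites if t != s],
                                cost, center, fixed_cost)
                  for s in open_sites}
        best = min(trials, key=trials.get)
        if trials[best] >= current:   # no elimination reduces the cost
            break
        open_sites.remove(best)
        current = trials[best]
    return open_sites, current

# Hypothetical one-dimensional example: sites are named by their position.
customers = [10, 11, 12, 1]
cost = lambda a, b: abs(a - b)
sites, final = drop_algorithm(customers, customers, cost, center=0, fixed_cost=3)
print(sites, final)
```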

Other,  less  promising,  formulations  of  the  warehouse 
location  problem  and  related  solution  techniques  are  contained  in 
[26],  [13],  [9],  [14]. 


B. Clustering Problems

Basic  aspects  of  the  TACOM  location  problem  can  be  posed 
in  a clustering  context.  The  clustering  problem  may  be  thought  of 
as  detecting  inherent  separations  between  subsets  of  a given  point 
set,  where  the  separations  may  be  in  terms  of  several  different 
measures.  In  the  TACOM  location  problem,  one  might  expect  some 
measure  to  be  available  such  that  nodes  clustered  under  this 
measure would most appropriately share a common line, or be
connected to the same TACOM. Many clustering techniques exist
[3], [23], [15], [1], [22]; however, few appear to suggest
measures which may be applicable to the TACOM location problem.
Among the more promising are those which attempt to cluster points
on a plane [28], [17]. Zahn [28] attempts to identify gestalt
clusters (two-dimensional point sets naturally perceived as separate
groupings) by connecting the points with a minimum spanning tree,
and deleting relatively long branches to form components, or clusters.
Jarvis and Patrick [17] offer a similarity measure based on shared
near neighbors to generate "nonglobular" clusters. This technique
and that of Zahn have much in common. However, neither of the
approaches appears to offer more than insight into the complexity
of the TACOM location problem.
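
The idea of clustering by deleting long minimum spanning tree branches can be sketched as below. The cut rule used here (drop any branch longer than `factor` times the mean branch length) is a simplifying assumption for illustration and is not the criterion of [28].

```python
from math import dist

def mst_edges(points):
    """Prim's algorithm on Euclidean points; returns (length, a, b) edges."""
    best = {i: (dist(points[0], points[i]), 0) for i in range(1, len(points))}
    edges = []
    while best:
        j = min(best, key=lambda i: best[i][0])
        w, parent = best.pop(j)
        edges.append((w, parent, j))
        for i in best:
            c = dist(points[j], points[i])
            if c < best[i][0]:
                best[i] = (c, j)
    return edges

def mst_clusters(points, factor=2.0):
    """Delete relatively long MST branches; return the resulting components."""
    edges = mst_edges(points)
    mean = sum(w for w, _, _ in edges) / len(edges)
    label = list(range(len(points)))

    def find(i):
        while label[i] != i:
            i = label[i]
        return i

    for w, a, b in edges:
        if w <= factor * mean:        # keep only the short branches
            label[find(b)] = find(a)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical point set with two visually separate groups.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
print(mst_clusters(points))
```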


C. Partitioning Problems


The TACOM location problem may be viewed as that of
partitioning a set of elements in a way that optimizes some
objective function defined on the set of all partitions. The
partitioning may be in terms of nodes sharing a common line, or
in terms of nodes connected to the same TACOM, and the objective
function is simply network cost. There are several formulations
of the partitioning problem, in terms of set theory [12], [24],
and graph theory [19], [20], [18]. Perhaps the most interesting
of these formulations, from the TACOM location perspective, is
the  formulation  by  Roach  [24]  in  which  the  objective  function 
is  to  minimize  the  maximum  within-cluster  distance.  Although 
the  solution  technique  offered  is  one  of  forming  reduced  subproblems, 
and  is  somewhat  improved  over  integer  programming,  it  is  still 
far  too  computationally  complex  to  be  of  interest  in  the  TACOM 
location  problem. 
IV. PREVIOUS APPROACHES


The TACOM location problem has received considerable attention
in the data communication network literature [7], [2], [11],
[5], [16], [4], [27]. The approaches have almost all
been heuristic in nature, and most have dealt with only the less
general version of the problem where only point-to-point connections
are permitted. For this simpler problem, two attractive approaches
seem to be a direct adaptation of the "Add" algorithm of the
warehouse location problem, and a "graceful" drop algorithm based
on link removal rather than TACOM removal [2]. The former seems
to be computationally considerably more efficient, but the latter
to yield, in general, slightly better results. An approach based
on combinatorial optimization over certain selected subsets of
the network has been offered by Greenburg [16], and has the
attraction of partitioning the main problem into subproblems in an
iterative manner, and reoptimizing the partitions based on global
considerations. The original set of node locations is partitioned
by first locating the pair of nodes closest together as a kernel
of a partition element. A point whose distance from the kernel
gives a cost of connection to the kernel greater than the cost of
locating a TACOM at the point is called an opposing point. The
kernel is then iteratively enlarged by adding the point closest
to the kernel which is closer to the kernel than to any opposing
point. The heuristic employed here is that such points are more
likely to be optimally connected to the same TACOM as the points
in the kernel than to be connected to a TACOM at one of the opposing
points. When no additional nodes can be added, the enlarged kernel
becomes a partition member and is removed from further consideration in
this initial partitioning phase. A new kernel is then selected,
and the process repeated to find a second partition element. The
procedure is continued until no further partitioning is achieved.

The  optimal  location  of  TACOMs  within  each  member  set  of  the 
partition  is  achieved  by  exhaustive  combinatorial  enumeration. 
However, optimality over each member set of the partition does not
necessarily give optimality over the entire set. Hence, an
iterative scheme of repartitioning and reoptimizing is next
pursued. Nodes are optimally assigned to the TACOMs originally
found (by assigning each node to its closest TACOM), thereby
creating a new partition of single TACOM sets. The optimal TACOM
location within each set is then found by exhaustive search.
Pairwise combinations of the sets are then optimized for one, two,
or three TACOMs, again by combinatoric enumeration. If different
TACOM locations result in the combined set, a new partitioning
is done by optimally assigning nodes to TACOMs. The process is
repeated until no pairwise revisions occur.

The difficulties here begin with the strategy for the
partitioning. There is no control over the size of the partition
sets, with a possible result being the original set itself. For
a set of N possible TACOM locations, combinatoric enumeration of
all possible combinations of locations takes $2^N - 1$ operations.
Each combination must have the nodes
optimally assigned and its cost evaluated. Clearly, for large
sets in the partition this is computationally prohibitive. The
iterative process of pairwise combination of partition elements
and optimizing TACOM locations over the combined pair is attractive,
but the enumeration form of the optimization is again costly. All
of the above difficulties are compounded considerably when
extension to multidrop lines is considered.
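
The growth of the enumeration is easy to check numerically. The short sketch below lists every non-empty subset of a candidate site set and compares the count with 2^N - 1; the helper name `nonempty_subsets` is ours.

```python
from itertools import combinations

def nonempty_subsets(sites):
    """Yield every non-empty combination of candidate TACOM sites."""
    sites = list(sites)
    for r in range(1, len(sites) + 1):
        yield from combinations(sites, r)

count = sum(1 for _ in nonempty_subsets(range(10)))
print(count, 2**10 - 1)

# The count doubles with every added site, so a partition set of even
# a few dozen sites is far beyond exhaustive evaluation.
for n in (10, 20, 40):
    print(n, 2**n - 1)
```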
The general TACOM location problem has been attacked by Woo
and  Tang  [27],  with  an  algorithm  that  approximates  the  problem 
with  a simpler  point-to-point  problem,  in  which  the  point-to-point 
cost  of  connecting  two  nodes  is  a weighted  average  of  the  direct 
connection  cost  and  the  shared  cost  of  connecting  through  the 
minimum  spanning  tree.  After  the  problem  simplification,  the 
"Add"  algorithm  is  applied  to  make  site  selections.  Recall  that 
in  the  "Add"  algorithm,  all  candidate  sites  are  evaluated  for 
savings,  the  best  selected  and,  with  assigned  nodes,  removed  from 
further  consideration.  The  procedure  is  iterated  until  no  new 
site is found to give positive savings. A rigorous proof of the
following  theorem  which  justifies  the  termination  condition  for 
the  "Add"  algorithm  has  been  reported  [27]. 

Theorem

Let $L$, $L_1$, and $L_{12}$ be optimal network assignments corresponding
respectively to TACOM sets $(j_0, j_1, \ldots, j_i)$, $(j_0, \ldots, j_i, j_{i+1})$, and
$(j_0, \ldots, j_i, j_{i+1}, j_{i+2})$, and let $C(L)$, $C(L_1)$, and $C(L_{12})$ be their
respective costs. Then $C(L) - C(L_1) > C(L_1) - C(L_{12})$.

Note  that  this  theorem  does  not  say  that  the  network  cost  is  a 
convex  function  of  the  number  of  TACOMs,  but  rather  that  the  cost  is 
convex  over  the  iterations  of  adding  a new  TACOM  to  an  existing 
set  of  TACOMs  without  reoptimization  of  the  existing  TACOM  locations. 
In  fact,  counterexamples  are  easily  produced  in  which  the  cost  is 
not convex over the number of TACOMs when the given number of TACOMs
are  optimally  located  in  each  case. 

The  Average  Tree-Direct  (ATD)  algorithm  described  above  appears 
to  be  the  best  algorithm  for  the  general  TACOM  location  problem 
that  is  currently  in  the  literature.  We  will  use  it  as  the  basis 
for  comparative  evaluation  of  the  algorithm  we  present  next. 
V. DESCRIPTION OF GENERAL APPROACH

There are many possible formulations of the TACOM location
problem,  and  many  possible  approaches  to  finding  acceptable 
solutions.  We  present  here  an  approach  to  the  following  formulation 
of  the  problem: 

Given:

• Set of nodes,

• Particular node which is the RESCOP,

• Constraint on number of nodes which may
share a line,

• Constraint on number of nodes a TACOM
may serve,

• Cost of connecting nodes,

• Cost of TACOM.

Find:

• A low cost feasible design in which
TACOMs may be used.

The  object  of  this  formulation  is  a total  network  design.  The 
approach is heuristic, and consequently no promise is made of
optimality, only of feasibility in terms of the given constraints.
The constraints used in this formulation were chosen because they
are often appropriate in reality, and are particularly simple. The
approach is also reasonable for various other constraints. These
will be discussed later. TACOMs are used only where they appear to
be beneficial, and consequently, for some node arrangements, designs
will result which use no TACOMs.

The general approach is characterized by the following four
steps.

1. Simplify the problem to a point-to-point
problem by replacing clusters of nodes by single
"center-of-mass" (COM) nodes.

2. Partition the reduced set of COM nodes by applying an
add algorithm, resulting in one of the COM nodes selected as
a TACOM site.

3. Select one of the original nodes as a real TACOM site in
each partition by examining the original nodes closest to the
COM node selected in the add algorithm, and selecting the best.

4. Apply a line-layout algorithm to each partition, with
its selected TACOM site serving as the central node.

A simplified flow chart for the approach we use is shown in Figure 4.
We interpret this flow chart in terms of the four steps.


A. Simplification by Clustering


Simplification  is  achieved  when  the  problem  is  reduced 
in size and converted to a point-to-point formulation. To
accomplish this, clusters of nodes are replaced by single nodes.

The  clusters  are  intended  to  reflect  natural  groupings  of  nodes 
that  can  be  most  appropriately  approximated  by  single  nodes 
at  their  center  of  mass.  The  clusters  are  limited  in  size  by 
the  line  constraint,  and  thus  also  reflect  possible  groupings 
of  nodes  to  share  a line. 

The clusters are formed by "rolling snowballs" in
a rather "balanced" fashion. First, the two nodes closest
together are selected [2]. If these two nodes can be put in
the same cluster (i.e., their being joined does not violate the
line constraint) [3], they are replaced by a single node at their
center-of-mass [4], called a COM node. The "weight" of each node
is simply the number of nodes contained in the cluster it
represents, with initially all weights equal to one.

If the two nodes cannot be merged, the next closest
together "pair" will be considered. As "pairs" of nodes are
identified which cannot be merged, they are removed from further
consideration in the clustering process [5], although the
individual nodes may reappear as members of other "pairs". The
clustering process continues until no "pair" of nodes can be
effectively merged [6]. An example of the clustering process
is shown in Figure 5. At this point, the original set of nodes
has  been  replaced  by  the  set  of  nodes  representing  the  clusters. 
This  set  is  smaller  in  number  by  a factor  slightly  less  than  the 
line  constraint,  and  the  relative  costs  of  connecting  the  nodes 
in  a cluster  to  different  sites  can  be  approximated  by  the 
point-to-point  costs  of  connecting  their  representative  node. 
Thus,  the  problem  is  reduced  in  size  and  converted  to  a point- 
to-point  form.  An  example  of  the  reduced  set,  with  associated 
weights  for  the  nodes,  is  shown  in  Figure  6. 
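
The clustering step described above can be sketched as follows. This is an illustrative reading under stated assumptions, not the formal algorithm of Section VI: nodes are taken to be points in the plane, closeness is Euclidean distance, and the line constraint is applied as a limit `w_max` on the combined weight of a merged pair.

```python
from itertools import combinations
from math import dist

def com_clusters(points, w_max):
    """Merge closest pairs into center-of-mass (COM) nodes.

    Returns a list of (x, y, weight, members), where members are
    indices into the original point list.
    """
    clusters = {i: (x, y, 1, [i]) for i, (x, y) in enumerate(points)}
    next_id = len(points)
    rejected = set()                  # pairs found to violate the constraint
    while True:
        pairs = [(dist(clusters[a][:2], clusters[b][:2]), a, b)
                 for a, b in combinations(sorted(clusters), 2)
                 if (a, b) not in rejected]
        if not pairs:
            break
        _, a, b = min(pairs)          # closest remaining pair
        xa, ya, wa, ma = clusters[a]
        xb, yb, wb, mb = clusters[b]
        if wa + wb > w_max:           # merge would violate line constraint
            rejected.add((a, b))
            continue
        w = wa + wb
        del clusters[a], clusters[b]
        clusters[next_id] = ((wa * xa + wb * xb) / w,
                             (wa * ya + wb * yb) / w, w, ma + mb)
        next_id += 1
    return list(clusters.values())

# Hypothetical example: five nodes, at most three nodes per line.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
for x, y, w, members in com_clusters(points, w_max=3):
    print(round(x, 2), round(y, 2), w, sorted(members))
```

Each pass either merges a pair or rejects one, so the loop always terminates; the individual nodes of a rejected pair remain free to merge with other partners.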

B.  Partitioning 


The  add  algorithm  examines  the  benefit  of  placing  a TACOM 
at each COM node. The benefit is determined by iteratively
associating with each COM node the other COM nodes which give the
greatest cost benefit by being connected to the TACOM instead of
the RESCOP, subject to the TACOM capacity constraint. After
the assignment of COM nodes to the TACOM, a heuristic
estimate, explained in detail later, is made of the cost benefit.
The estimate incorporates the weights of the nodes, and is
different from the simple cost gain of the COM as a TACOM in a
point-to-point case. The COM which, as a TACOM site, has the
greatest heuristic benefit estimate is selected as the best
[7]. The simple point-to-point cost gain of this COM node is
then checked [8]. If the cost gain is positive, the selected
COM node and all those COM nodes assigned to the selected
location in the add algorithm are partitioned from the remaining
nodes to form a separate subproblem [9], as shown in Figure 6.
If  the  cost  gain  is  not  positive,  it  is  concluded  that  the  best 
TACOM  site  is  not  cost-effective,  and  all  the  remaining  COM 
nodes  are  then  assigned  to  the  RESCOP  [13].  This  forms  a last 
partition  element  to  be  treated  as  a subproblem. 
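
The simple point-to-point cost gain check can be sketched as below. This covers only the plain gain of a candidate COM node as a TACOM site, with greedy assignment under the capacity constraint; it does not reproduce the weighted heuristic estimate, which is defined later in the report. Counting capacity in terms of cluster weights, and taking the TACOM-to-RESCOP cost equal to the ordinary connection cost, are assumptions.

```python
def point_to_point_gain(site, coms, cost, rescop, D, c_max):
    """Gain from placing a TACOM at COM node `site`.

    coms maps a COM id to (position, weight). Returns (gain, assigned ids).
    """
    site_pos, site_w = coms[site]
    assigned, used = [site], site_w
    total = cost(rescop, site_pos)    # the site no longer needs its own line
    savings = []
    for cid, (p, w) in coms.items():
        if cid == site:
            continue
        s = cost(rescop, p) - cost(site_pos, p)
        if s > 0:
            savings.append((s, cid, w))
    savings.sort(key=lambda t: (-t[0], t[1]))   # greatest benefit first
    for s, cid, w in savings:
        if used + w <= c_max:         # TACOM capacity constraint
            assigned.append(cid)
            used += w
            total += s
    gain = total - D - cost(rescop, site_pos)
    return gain, assigned

# Hypothetical one-dimensional example with the RESCOP at position 0.
coms = {"a": (10, 2), "b": (12, 3), "c": (1, 1)}
cost = lambda p, q: abs(p - q)
print(point_to_point_gain("a", coms, cost, rescop=0, D=3, c_max=5))
print(point_to_point_gain("a", coms, cost, rescop=0, D=3, c_max=4))
```

With the tighter capacity the gain turns negative, which corresponds to the case where the best TACOM site is judged not cost-effective.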


C. Local Optimization


The  output  of  the  partition  process  is  a particular  sub- 
set of  COM  nodes  identified  as  deserving  a TACOM,  and  a particular 
member  of  the  subset  identified  as  the  appropriate  site  for  the 
TACOM.  However,  the  COM  nodes  are  representatives  of  clusters, 
and in reality some particular original node site must be selected
for the TACOM. To select this site, each of the k nodes
closest to the selected COM node is evaluated, as shown in Figure
7, with k = 3.


The same measure is used in the evaluation as is used
in the COM node selection. The node with the greatest heuristic
estimate of benefit is then selected as the actual TACOM site
[10]. This gives a partition of the nodes into a subproblem
complete with an actual node site chosen for the TACOM.

D. Line Layout

In  order  to  apply  a line  layout  algorithm  to  the  sub- 
problem formed  above,  it  is  first  necessary  to  replace  all  the 
COM  nodes  by  the  actual  nodes  they  represent.  This  gives  a 
partition  of  actual  nodes  with  an  actual  node  selected  as  the 
TACOM  site.  A line  layout  algorithm  is  then  applied  to  the 
partition  [11],  giving  a result  as  shown  in  Figure  8.  The 
nodes  in  this  partition  are  then  removed  from  further  considera- 
tion. If  no  nodes  are  left  [12],  the  design  is  complete  [14]. 

If nodes remain, the process is repeated [15]. The process is
repeated until all nodes are assigned to TACOMs or until
no additional TACOM is estimated to be beneficial [8]. In this
latter case, the remaining nodes are assigned to the RESCOP
[13], and the line layout algorithm is applied [11] to complete
the design, as shown in Figure 8.

VI. THE CENTER-OF-MASS ALGORITHM


In  this  section,  we  give  a precise  description  of  the  algorithm 
outlined  above.  The  algorithm  is  formally  stated  at  the  end  of  the 
section. The first part of the section is devoted to interpreting
the formal statement.

The formal statement begins with a list of the relevant definitions,
parameters, constants, constraints, and cost function. Note
that the constraints are of the simple capacity variety, and the
cost function for connecting two nodes is simply the Euclidean
distance between the nodes. It will be clear after examination of
the algorithm that other constraints and cost functions are equally
usable.

In the initialization phase (Step 0), A is formed as the set
of N - 1 nodes of interest, where the RESCOP (node 1) is not included.
The RESCOP is not considered eligible for a TACOM, as nodes may
connect to it directly. The set D0 is the set of all nearest
neighbor distances. For each node i, a list $L_i$ is kept of the real
nodes represented by the node. Initially, the list for each node
contains only the node itself. After two nodes merge, the list
for the new node will contain the members of the lists of the two
nodes that merged, thereby keeping track of the represented nodes.
The weight assigned to each node is initially set equal to one.
Thus, the constraints are simply on the number of nodes per line
and the number of nodes per TACOM. For each node i, its nearest
neighbor j is represented as $n_i$. The average "nearest neighbor
distance", $d_{avg}$, is used to set a maximum distance $d_{max}$ over which
nodes may be merged. The parameter $\alpha$ determines this maximum
distance. The reason for establishing such a maximum distance is
based on the clustering objective and procedure, as discussed
below.

The clustering objective is to replace natural groupings of
nodes by a single node. The procedure is to iteratively merge the
closest  feasible  pair  of  nodes.  As  this  procedure  nears  completion, 
most  nodes  are  representative  of  near  capacity  clusters,  and  con- 
sequently, cannot  be  merged.  Thus,  the  shortest  distance  between 
feasible  nodes  may  become  quite  large.  Merging  of  these  nodes  would 
not  reflect  the  objective  of  natural  groupings.  The  maximum  distance 
is  designed  to  prevent  such  mergings. 

After  determining  a maximum  allowable  distance,  the  set  D is 
formed  as  the  set  of  allowable  nearest  neighbor  distances. 
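
The initialization described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function names `nearest_neighbors` and `initialize`, the dictionary layout, and the sample coordinates are hypothetical and do not come from the report, which states the step only in set notation.

```python
import math

def nearest_neighbors(coords):
    """Return {node: (nearest_node, distance)} using Euclidean distance."""
    table = {}
    for i, (xi, yi) in coords.items():
        candidates = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in coords.items()
            if j != i
        ]
        dist, j = min(candidates)  # ties go to the smaller node id
        table[i] = (j, dist)
    return table

def initialize(coords, alpha):
    """Step 0 sketch: d_avg, d_max = alpha * d_avg, and the allowable set D."""
    nn = nearest_neighbors(coords)
    d_avg = sum(dist for _, dist in nn.values()) / len(nn)
    d_max = alpha * d_avg
    allowed = {i: pair for i, pair in nn.items() if pair[1] < d_max}
    return d_avg, d_max, allowed

# Hypothetical data: node 1 (the RESCOP) is excluded, as in the text.
coords = {2: (0.0, 0.0), 3: (1.0, 0.0), 4: (0.0, 2.0), 5: (10.0, 10.0)}
d_avg, d_max, D = initialize(coords, alpha=1.5)
```

In this small case the isolated node 5 has a nearest neighbor distance well above $d_{max}$, so it is left out of D, which is the behavior the maximum distance is intended to produce.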

The  merging  of  two  nodes  is  accomplished  in  Step  1 of  the  al- 
gorithm. Prior  to  entering  this  step,  it  is  known  that  all  nodes 
with their nearest neighbor distance contained in D can be feasibly
merged with their nearest neighbor. When Step 1 is entered after
the initialization process, this is certainly true, or else the
problem is point-to-point in form, and not multipoint. Following
Step  1 is  an  update  procedure  designed  to  ensure  that  this  is  true 
when  the  step  is  reentered. 

The  first  task  in  the  merge  process  is  selection  of  the  minimum 
nearest  neighbor  distance.  Then  a new  node  is  formed  with  its  loca- 
tion at  the  center-of-mass  of  the  two  nodes  being  merged,  where 
the  weights  used  in  the  center-of-mass  calculation  are  simply  the 
weights  assigned  to  each  node.  The  list  of  real  nodes  represented 
by  the  new  node  is  formed  by  merging  the  lists  of  the  two  old  nodes. 
The  old  nodes  are  then  removed  from  further  consideration  by  deleting 
them  from  the  set  A.  The  new  node  will  be  added  to  the  set  A later. 
The  weight  of  the  new  node  is  simply  the  sum  of  the  weights  of  the 
old  nodes.  Note  that  with  the  initial  weights  all  set  equal  to  one, 
the  weight  is  simply  the  number  of  real  nodes  represented  by  the 
new  node. 
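
As a small worked illustration of the merge just described, the following sketch forms the new node from two old ones. The dictionary keys (`x`, `y`, `w`, `members`) and the function name `merge_nodes` are assumptions made for this example only.

```python
def merge_nodes(a, b):
    """Merge two nodes into one located at their weighted center of mass."""
    w = a["w"] + b["w"]
    return {
        "x": (a["w"] * a["x"] + b["w"] * b["x"]) / w,
        "y": (a["w"] * a["y"] + b["w"] * b["y"]) / w,
        "w": w,                                   # sum of the old weights
        "members": a["members"] + b["members"],   # merged list of real nodes
    }

# Hypothetical example: a single real node merged with a two-node cluster.
single = {"x": 0.0, "y": 0.0, "w": 1, "members": [2]}
pair = {"x": 3.0, "y": 0.0, "w": 2, "members": [3, 4]}
merged = merge_nodes(single, pair)
```

With unit initial weights, the weight of the merged node (3 here) equals the number of real nodes it represents, as noted in the text.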

After a new node is formed, its nearest neighbor must be found,
and all nodes which had one of the old nodes as a nearest neighbor
must also have new nearest neighbors found. This is accomplished in
the update procedure (Step 2). D1 is the subset of D containing
all  the  distances  for  which  the  nearest  neighbors  are  not  one  of 
the  two  nodes  just  merged  in  the  previous  step.  A2  is  the  set  of 
all  nodes  whose  nearest  neighbors  were  one  of  the  two  nodes  just 
merged,  plus  the  new  node;  i.e.,  the  set  of  all  nodes  for  which 
new  nearest  neighbors  are  to  be  found.  The  new  node  is  added  to 
the  set  A,  making  it  again  the  set  of  all  nodes  which  are  candidates 
for  merging.  The  nearest  feasible  neighbor  distances  are  then  found 
for the nodes which need new distances, i.e., the members of A2.
This is the set D2. D3 is the subset of D2 containing distances
less than the maximum allowed distance. For each node i which has
a new feasible, allowable distance neighbor j, $n_i$ is defined as j.
The set D of all allowable nearest neighbor distances is then formed
by combining the unchanged distances, D1, with the new distances, D3.
Note that all members of D are defined for pairs of nodes which are
feasible to merge. If D is not empty, then the merging process
is continued. If D is empty, no further merging is possible, and
the add process is entered (Step 3).
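
The merge and update cycle can be sketched as a single loop. The sketch below is a simplification, not the report's procedure: instead of maintaining the sets D1, D2 and D3 incrementally, it recomputes the closest feasible pair from scratch on every pass, which selects the same pair as the minimum of D when nearest feasible neighbors are recomputed for all nodes. The function name `cluster`, the tuple layout, and the sample data are hypothetical.

```python
import math

def cluster(points, alpha, w_max):
    """points: {id: (x, y)}. Returns a list of (x, y, weight, members)."""
    clusters = {i: (x, y, 1, [i]) for i, (x, y) in points.items()}

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Step 0: d_max from the initial nearest neighbor distances.
    nn = [
        min(dist(c, o) for j, o in clusters.items() if j != i)
        for i, c in clusters.items()
    ]
    d_max = alpha * sum(nn) / len(nn)

    next_id = max(clusters) + 1
    while True:
        best = None
        ids = sorted(clusters)
        for pos, i in enumerate(ids):
            for j in ids[pos + 1:]:
                a, b = clusters[i], clusters[j]
                if a[2] + b[2] > w_max:      # line capacity constraint
                    continue
                d = dist(a, b)
                if d < d_max and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:                     # D is empty: merging is finished
            return list(clusters.values())
        _, i, j = best
        a, b = clusters.pop(i), clusters.pop(j)
        w = a[2] + b[2]
        clusters[next_id] = (
            (a[2] * a[0] + b[2] * b[0]) / w,
            (a[2] * a[1] + b[2] * b[1]) / w,
            w,
            a[3] + b[3],
        )
        next_id += 1

points = {2: (0.0, 0.0), 3: (1.0, 0.0), 4: (10.0, 0.0), 5: (11.0, 0.0), 6: (50.0, 50.0)}
small_lines = cluster(points, alpha=2.0, w_max=2)
large_lines = cluster(points, alpha=2.0, w_max=4)
```

With a line capacity of two, the two natural pairs are merged and the distant node is left alone; raising the capacity to four lets the two pairs merge again, while the distant node still exceeds the maximum distance.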

The first task of the add algorithm is the iterative examination
of each node for the possible benefit of locating a TACOM at
its site. This task is accomplished in Step 3. The savings achieved
by connecting a node j to a TACOM at i versus the RESCOP (node 1)
is defined as $s_{ij}$. For each node i, nodes are iteratively associated
with the node in order of decreasing savings, with the maximum
savings node associated first. The iterations continue until no
further savings are possible, or until the capacity constraint
prevents any further associations. When the iterative process is
terminated, the benefit of placing a TACOM at the node site is
evaluated on the basis of the associated nodes. Thus, for each
node i, the iterative process is initiated by forming $\hat{A}$ as the set
of candidate nodes for connection to i, and $B_i$ as the list of nodes
actually to be associated (initially empty). T is the running sum
of weights of the associated nodes, used to check the capacity
constraint.

Part  A)  is  the  iterative  process  of  association.  If  all 
candidates have been examined, then the evaluation, Part B), is
commenced.  If  there  are  still  unchecked  candidates,  select  the 
one  with  the  greatest  savings.  If  the  savings  are  not  positive,  no 
further  savings  are  possible,  and  commence  the  evaluation.  Other- 
wise, remove  the  selected  node  from  the  candidate  set  to  ensure 
that  it  is  not  selected  a second  time,  and  then  determine  if  the 
association  is  consistent  with  the  constraint.  If  not,  go  on  to 
the  next  candidate.  If  the  association  is  feasible,  add  the  node 
to  the  association  list,  and  add  its  weight  to  the  running  sum. 

If  the  sum  is  equal  to  the  capacity,  no  further  associations  are 
possible,  and  the  evaluation  process  is  commenced.  If  the  sum  is 
less  than  the  capacity,  additional  associations  may  be  made,  and  thus, 
return  to  consider  the  next  candidate. 
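
The association loop of Part A can be sketched as follows. The names `euclid` and `associate`, the `(x, y, weight)` tuple layout, and the sample nodes are assumptions for illustration; the cost is taken to be Euclidean distance, as in the basic formulation.

```python
import math

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def associate(site, nodes, rescop_xy, c_max):
    """Greedy Part A sketch. nodes maps id -> (x, y, weight)."""
    site_xy = nodes[site][:2]
    # Savings of connecting j to the candidate site instead of the RESCOP.
    savings = {
        j: euclid((x, y), rescop_xy) - euclid((x, y), site_xy)
        for j, (x, y, _) in nodes.items()
    }
    associated, total = [], 0
    for j in sorted(savings, key=savings.get, reverse=True):
        if savings[j] <= 0:              # no further savings are possible
            break
        if nodes[j][2] + total > c_max:  # infeasible: try the next candidate
            continue
        associated.append(j)
        total += nodes[j][2]
        if total == c_max:               # capacity reached
            break
    return associated, total

nodes = {2: (10.0, 0.0, 2), 3: (10.0, 2.0, 2), 4: (10.0, -1.0, 3), 5: (1.0, 0.0, 1)}
full = associate(2, nodes, (0.0, 0.0), c_max=5)
tight = associate(2, nodes, (0.0, 0.0), c_max=4)
```

In the second call, node 4 is skipped because its weight would exceed the capacity, and the loop goes on to the next candidate rather than stopping, as the text prescribes.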

Part  B)  is  the  evaluation  process.  First,  the  point-to-point 
savings  obtained  by  placing  a TACOM  at  the  node  are  determined. 

This  is  simply  the  sum  of  all  the  individual  savings  found  for  the 
associated  nodes,  minus  the  cost  of  connecting  the  TACOM  to  the 
RESCOP  and  the  cost  of  the  TACOM.  Then  a relative  multipoint 
benefit  is  calculated.  This  is  a heuristic  measure  of  the  node 
as  a site  for  a TACOM  considering  the  multipoint  nature  of  the 
problem.  The  measure  is  calculated  as  the  sum  of  the  weighted 
savings  minus  an  emphasized  cost  of  connecting  the  TACOM  to  the 
RESCOP.  The  weights  serve  to  move  TACOMS  towards  the  bigger 
clusters, and the emphasis parameter, $\gamma$, serves to move TACOMs
towards  the  RESCOP. 
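
The two quantities computed in Part B can be illustrated with a short sketch. The function name `evaluate_site` and all numerical values are hypothetical; the two formulas follow the verbal description above (summed savings less connection cost and TACOM cost, and weighted savings less the emphasized connection cost).

```python
def evaluate_site(savings, weights, associated, d_site_rescop, charge, gamma):
    """Return (point-to-point savings S, heuristic multipoint benefit r)."""
    s_total = sum(savings[j] for j in associated) - d_site_rescop - charge
    r = sum(savings[j] * weights[j] for j in associated) - gamma * d_site_rescop
    return s_total, r

# Hypothetical values for one candidate site.
savings = {2: 10.0, 4: 9.0}
weights = {2: 2, 4: 3}
S, r = evaluate_site(savings, weights, [2, 4], d_site_rescop=10.0,
                     charge=5.0, gamma=1.5)
```

Here S = 19 - 10 - 5 = 4 and r = 47 - 15 = 32. The site ranking uses r, while S is the quantity later tested against zero to decide whether the selected site is cost-effective.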

After evaluating each node as a possible TACOM site, the one
with the greatest heuristic measure of benefit is selected (Step
4). If the point-to-point savings for this site are not positive,
then it is predicted that an additional TACOM will not be
cost-effective, and the final stage, Step 5, is then entered. This
stopping condition has its foundation in the theorem stated earlier
which characterizes the costs resulting from the intermediate
steps of the "Add" algorithm as a convex function of the addition of
new TACOMs. If the selected site is found to be cost-effective on a
point-to-point basis, then the process of forming a subproblem is
commenced.

First, the set of center-of-mass nodes associated with the
selected site is removed from further consideration in the main
problem. Then a partition element, P, is formed as the set of real
nodes represented by all the center-of-mass nodes associated with
the selected TACOM site. Since a real node must be selected for
the actual TACOM site, a local optimization procedure is used to
evaluate possible real node sites. The set K is formed as the k
real nodes closest to the center-of-mass node selected as the TACOM
site. A heuristic measure, $z_i$, of relative cost for each of these
nodes is then evaluated, and the node with the minimum cost
measure is selected. Thus, a partition element of real nodes and
selected TACOM site has been formed as a subproblem for the line-
layout algorithm. If additional center-of-mass nodes remain, the
next iteration of the add algorithm is commenced; otherwise, we
are done.

When  center-of-mass  nodes  still  exist  in  the  consideration  set 
A, but  TACOMS  are  predicted  to  not  be  cost-effective,  then  the  remain- 
ing nodes  are  associated  with  the  RESCOP  site.  The  line-layout  for 
the  real  nodes  represented  by  these  center-of-mass  nodes  is  the  only 
remaining  problem,  and  is  handled  in  Step  5. 

We  note  that  in  step  2 all  nodes  which  have  their  nearest 
neighbor  distance  more  than  the  allowed  maximum  are  excluded  from 
further consideration by deleting their distance from the set D.
In fact, the merging of two nodes may eventually lead to a new
nearest neighbor with acceptable distance, as shown in Figure 9.
However, this appears to be sufficiently rare and inconsequential to
warrant exclusion in favor of reducing the computational burden. For
completeness, an alternate Step 2, which includes this case, is shown.

COM ALGORITHM


Definitions:

   A = Set of nodes.
   $L_i$ = Set of nodes associated with $i \in A$.
   $x_i, y_i$ = Coordinates of node i.
   $d_i(j)$ = Cost of connecting j to i.
   $w_i$ = Weight assigned to node i.

Parameters:

   $\gamma$ - Used to emphasize proximity of possible TACOM site
   to RESCOP.
   $\alpha$ - Used to limit distance over which nodes may be merged.
   k - Number of real nodes nearest to the merged node which are to be
   examined as possible TACOM sites.

Constants:

   charge - Cost of TACOM.
   $w_{max}$ - Line capacity.
   $c_{max}$ - TACOM capacity.
   node 1 - RESCOP.



Constraints:

A) For a line shared by the nodes i = 1, 2, ..., M to be
feasible:

   $\sum_{i=1}^{M} w_i \le w_{max}$

B) For a TACOM serving the nodes i = 1, 2, ..., M to
be feasible:

   $\sum_{i=1}^{M} w_i \le c_{max}$

Cost Function:

   $d_i(j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$

STEP 0: (Initialization)


   $A = \{ i \mid i = 2, 3, \ldots, N \}$

   $D0 = \{ d_i(j) \mid d_i(j) = \min_{m \in A,\ m \ne i} d_i(m),\ i = 2, 3, \ldots, N \}$

   for i = 2, 3, ..., N:
      $L_i = \{ i \}$
      $w_i = 1$

   for each $d_i(j) \in D0$: $n_i = j$

   $d_{avg} = \frac{1}{N-1} \sum_{D0} d_i(j)$

   $d_{max} = \alpha \cdot d_{avg}$

   $D = \{ d_i(j) \mid d_i(j) \in D0,\ d_i(j) < d_{max} \}$

STEP 1: (Merge)

   $d_\ell(n_\ell) = \min_{D} d_i(n_i)$

   Form a new node k with:

   $x_k = \frac{w_\ell x_\ell + w_{n_\ell} x_{n_\ell}}{w_\ell + w_{n_\ell}}$

   $y_k = \frac{w_\ell y_\ell + w_{n_\ell} y_{n_\ell}}{w_\ell + w_{n_\ell}}$

   $L_k = L_\ell \cup L_{n_\ell}$

   $A = A - \{ \ell, n_\ell \}$

   $w_k = w_\ell + w_{n_\ell}$

STEP 2: (Update)


   $D1 = D - \{ d_i(n_i) \mid n_i \in \{ \ell, n_\ell \} \}$

   $A2 = \{ i \mid i \in A,\ n_i \in \{ \ell, n_\ell \} \} \cup \{ k \}$

   $A = A \cup \{ k \}$

   $D2 = \{ d_i(j) \mid i \in A2,\ d_i(j) = \min_{m \in A,\ m \ne i,\ w_i + w_m \le w_{max}} d_i(m) \}$

   $D3 = \{ d_i(j) \mid d_i(j) \in D2,\ d_i(j) < d_{max} \}$

   For all i, j such that $d_i(j) \in D3$: $n_i = j$

   $D = D1 \cup D3$

   If $D = \emptyset$, then go to Step 3; otherwise go to Step 1.

STEP 3: (Evaluate Each Site)


   Let $s_{ij} = d_j(1) - d_j(i)$

   For each i:

      $\hat{A} = A$
      $B_i = \emptyset$
      $T = 0$

   A) If $\hat{A} = \emptyset$, then go to B)
      Select $\ell$ such that $s_{i\ell} = \max_{j \in \hat{A}} s_{ij}$
      If $s_{i\ell} \le 0$, then go to B)
      $\hat{A} = \hat{A} - \{ \ell \}$
      If $w_\ell + T > c_{max}$, then go to A)
      $B_i = B_i \cup \{ \ell \}$
      $T = T + w_\ell$
      If $T = c_{max}$, then go to B)
      Go to A)

   B) $S_i = \sum_{j \in B_i} s_{ij} - d_1(i) - charge$

      $r_i = \sum_{j \in B_i} s_{ij} \cdot w_j - d_1(i) \cdot \gamma$

STEP 4: (Select Best)


   $r_{i^*} = \max_{i \in A} r_i$

   If $S_{i^*} \le 0$, then go to Step 5.

   $A = A - B_{i^*}$

   $P = \bigcup_{j \in B_{i^*}} L_j$

   $K = \{ i_n \mid i_n \in P,\ n = 1, \ldots, k,\ d_{i^*}(i_n) \le d_{i^*}(i_m),\ i_m \notin K \}$

   $z_{i'} = \min_{i \in K} \left[ \sum_{j \in B_{i^*}} d_i(j) \cdot w_j + d_1(i) \cdot \gamma \right]$

   Do line layout on P with i' = TACOM site.
   If $A \ne \emptyset$, then go to Step 3.
   ELSE STOP.

STEP 5: (Finish)


   Do line layout on P with i = RESCOP site.

ALTERNATE STEP 2: (Alternate Update)

   $A = A \cup \{ k \}$

   $D1 = \{ d_i(j) \mid i \in A,\ d_i(j) = \min_{m \in A,\ m \ne i,\ w_i + w_m \le w_{max}} d_i(m) \}$

   $D = \{ d_i(j) \mid d_i(j) \in D1,\ d_i(j) < d_{max} \}$

   If $D = \emptyset$, then go to Step 3.
   Else for each i, j such that $d_i(j) \in D$: $n_i = j$.
   Go to Step 1.

VII. PERFORMANCE RESULTS


The COM Algorithm is a heuristic approach to a rather complex
problem.  In  order  to  evaluate  its  performance,  we  have  implemented 
the  algorithm  and  applied  it  to  a number  of  problems  with  randomly 
positioned  nodes.  The  results  of  these  experiments  are  reported 
below.  For  all  experiments,  the  implementation  was  in  Fortran, 
and the experiments were conducted on a CDC 6600 computer. Each of the
problems was of the simple type described earlier, with constraints
on the number of nodes per line and nodes per TACOM, and one preset
RESCOP  site  to  which  all  nodes  must  be  connected,  either  through 
TACOMs  or  directly  through  their  multipoint  lines.  The  cost 
function  was  a telpak  rate  of  $.50  per  mile  and  $40  per  drop  as  a 
monthly  charge.  In  each  experiment,  the  values  chosen  for  the 
program  parameters  were  held  fixed  for  all  cases;  no  attempt  was 
made  at  "fine  tuning". 

In  order  to  have  a comparison  basis  for  the  evaluation,  we 
implemented  the  best  other  heuristic  algorithm  for  this  problem 
that  we  could  find  in  the  literature.  This  was  the  approach  of 
converting  the  problem  to  a point-to-point  problem  by  forming  a 
cost matrix based on the average of the cost of connecting two
nodes  through  the  minimum  spanning  tree  and  the  cost  of  connecting 
them  directly,  as  reported  by  Woo  and  Tang  [27].  The  implementa- 
tion of  this  algorithm  (which  we  call  the  Average  Tree-Direct  (ATD) 
algorithm),  was  also  in  Fortran  on  a CDC  6600,  and  used  the  same 
line  layout  procedure  used  in  the  COM  implementation.  In  order  to 
ensure  that  our  implementation  of  this  algorithm  was  reasonable  for 
a comparison  base,  we  applied  the  algorithm  to  a problem  of  four 
hundred  nodes  distributed  in  a random  manner  based  on  population 
densities  (the  distribution  technique  will  be  explained  in  detail 
later). With TACOMs having a cost of $2500 per month and a capacity
of 100 nodes, and the RESCOP site located in Atlanta, the resulting
design is shown in Figure 10. We feel the problem is quite comparable
to  the  example  Woo  and  Tang  reported.  The  running  time  on  the  CDC 
6600  was  approximately  160  seconds,  whereas  the  time  they  reported 
for  running  on  a 360/91  was  approximately  60  seconds.  However,  in 
our  implementation  we  used  a true  combination  of  weighted  tree 
distance  and  direct  distance,  whereas  in  their  example  it  appears 
the  combination  was  approximated  with  only  a weighted  direct 
distance. Given also that their design resulted in only three
TACOMs and ours in four, thus requiring more time in the add algorithm
phase, and the relative computing power of the two machines for such
problems being slightly more than 2:1 (in favor of the 91), the
implementation appears reasonably comparable. In the ATD algorithm
the cost of connecting two nodes i and j is approximated as

   $cost_{ij} = \alpha t_{ij} + \beta d_{ij}$

where $d_{ij}$ is the direct connection cost, $t_{ij}$ is the tree cost, and
$\alpha$ and $\beta$ are parameters with $0 \le \alpha, \beta \le 1$. The tree cost is defined
as follows:

Let $\{ b_h \}$ be the set of links (or branches) in the minimal
spanning tree; for every $b_h$, define $B_j(b_h)$ as the set of nodes
disconnected from j upon removal of $b_h$ from the minimal spanning
tree, and let $|B_j(b_h)|$ be the number of such nodes. Then

   $t_{ij} = \sum_{b_h \in \pi_{ij}} \frac{d_h}{|B_j(b_h)|}$

where $d_h$ is the cost of link $b_h$, and $\pi_{ij}$ is the unique path from
j to i consisting of a sequence of branches $b_h$.

The values for the parameters $\alpha$ and $\beta$ used in our implementation were
determined by optimizing for a pilot problem, and then held fixed
during all the experiments. Our experience with the pilot problem
supported the observation by Woo and Tang that the design results
are  rather  insensitive  to  variations  in  the  parameters  around  the 
optimum  values.  We  now  present  descriptions  and  results  of  the 
evaluation  experiments. 

A.  Uniformly  Randomly  Distributed  Nodes 

The first series of experiments was performed on problems
where the nodes were uniformly randomly distributed over a 2000 by
3000  mile  rectangle.  The  algorithm  was  applied  to  problems  of  50, 
100,  200,  and  400  nodes.  The  node  distributions  for  these  problems 
are  shown  in  Figures  11,  12,  13  and  14.  The  minimum  spanning  trees 
are  shown  connecting  the  nodes.  Three  RESCOP  sites  were  considered, 
labeled  as  1,  2,  and  3 in  the  figures. 

The  first  experiment  used  constraints  of  four  nodes  per 
line  and  twenty  nodes  per  TACOM;  with  a TACOM  cost  of  $200  per  month. 
Problems  of  50,  100,  and  200  nodes  were  considered,  with  the  cost 
results shown in Table 1. In eight out of the nine comparisons,
the COM algorithm had the lower cost, with an average improvement of
3.1% [(ATD - COM)/ATD]. A typical COM design is shown in Figure 15.
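
The improvement figure can be checked directly from the costs listed in Table 1. The sketch below is only a worked arithmetic example; the variable names are hypothetical, and the cost values are those of Table 1 as read from the scanned report.

```python
# Costs from Table 1, ordered by RESCOP site 1, 2, 3 for each problem size.
com = {50: [12754, 13373, 12885], 100: [20348, 20809, 21371], 200: [34353, 37411, 36845]}
atd = {50: [13409, 13215, 13503], 100: [21332, 22745, 22000], 200: [34443, 38011, 37556]}

def improvement(com_cost, atd_cost):
    """Percent improvement of COM over ATD: (ATD - COM) / ATD."""
    return 100.0 * (atd_cost - com_cost) / atd_cost

per_size = {
    n: sum(improvement(c, a) for c, a in zip(com[n], atd[n])) / len(com[n])
    for n in com
}
overall = sum(per_size.values()) / len(per_size)
com_wins = sum(c < a for n in com for c, a in zip(com[n], atd[n]))
```

Under this reading of the table, the per-size averages come to roughly 2.8%, 5.3% and 1.2%, the overall average to about 3.1%, and COM is cheaper in eight of the nine cases, in agreement with the figures quoted in the text.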

In  the  second  experiment,  the  same  nine  problems  were 
considered,  but  with  a TACOM  cost  of  $1500/month.  The  results  are 
shown in Table 2. The COM algorithm produced the lower cost
designs in the same eight out of nine cases as before, with an
average improvement of 3.8%. A typical COM design for this experiment
is shown in Figure 16. The same problem is shown as before;
but  note  the  reduction  in  number  of  TACOMS  due  to  the  increase  in 
their  cost. 

In  the  third  experiment,  the  constraints  were  changed  to 
10  nodes  per  line  and  50  nodes  per  TACOM.  The  TACOM  cost  was  fixed 
at  $1500,  and  problems  of  100,  200,  and  400  nodes  were  considered. 

The  results  are  shown  in  Table  3.  In  all  cases  the  COM  algorithm 
produces the lower cost designs, with an average improvement of
7.2%. A typical design is shown in Figure 17. The average cost
of the designs produced by the COM algorithm as a function of the
number of nodes is shown in Figure 18. Note that, as would be
expected,  the  larger  capacity  constraints  not  only  give  lower  cost 
designs, but  also  a less  rapidly  growing  cost-curve. 


B. Randomly Distributed Based on Population

In  most  real  problems,  the  nodes  will  not  be  uniformly 
distributed  over  a nice  rectangular  region.  In  order  to  pose  a 
more  realistic  problem,  nodes  were  located  throughout  the  United 
States in a random manner based on population density. A weight of
1000  was  divided  among  the  238  most  populated  cities  (i.e.,  those 
with  populations  greater  than  50,000)  in  proportion  to  their  popu- 
lation. A rectangular  region  was  determined  for  each  city,  or 
collection  of  cities,  to  reflect  the  feasibility  of  the  region  to 
support  a population  segment  with  access  to  urban  facilities.  Thus, 
consideration was given to natural geographical boundaries, such as
mountains,  lakes  and  coast  lines,  to  major  roads  in  the  area,  to 
the  number  of  nearby  smaller  communities,  and  to  the  natural  pattern 
of  urbanization  between  relatively  close  major  population  centers. 
Using  this  approach,  123  regions  were  defined,  with  varying  sizes  of 
approximately  70  miles  square. 

Once  a number  of  nodes  has  been  allocated  to  a region  in 
proportion  to  population,  the  geographic  positions  of  the  nodes 
within  the  region  are  uniformly  randomly  distributed.  With  a large 
number  of  nodes,  it  is  reasonable  to  anticipate  that  some  of  them 
may be located at points with no discernible geographic significance.
Therefore, a fraction of the nodes were located at random in a
large geographic segment: east of Denver, west of Pittsburgh, north
of Austin, and south of Milwaukee. In the two experiments reported
below,  problems  of  400  nodes  were  considered,  with  5%  distributed 
in the large region.

In each experiment, the RESCOP site was Atlanta, the TACOM
cost was $2500 per month, and the line constraint was ten nodes per
line. With a TACOM constraint of 100 nodes per TACOM, the ATD
algorithm produced the design shown in Figure 10, with a cost of
$41,508.50/month, and the COM algorithm produced the design shown
in Figure 19, with a cost of $39,418.50/month. In this case the
COM algorithm produced a design lower in cost by 5%.

As a final experiment, we applied the algorithms to the
same problems as described above, but with no constraint on the TACOM
capacities. The designs they produced are shown in Figures 20 and
21. As can be seen from the figures, the designs are very similar.
The COM algorithm produced a lower cost design by only 0.50%. This
result, coupled with the others, suggests that the COM algorithm is
perhaps more sensitive to TACOM capacity. In fact, in the 27
comparison cases examined, the COM algorithm had fewer TACOMs in
18 cases, and the same number in the remaining nine cases. On the
average, it used 21% fewer TACOMs [$(N_{ATD} - N_{COM})/N_{ATD}$].

C. Computation Time


A significant  factor  for  a design  algorithm  is  its 
efficiency,  i.e.,  the  computer  time  it  requires.  In  order  to 
appraise  this  attribute  of  the  COM  algorithm,  the  execution  time 
for  the  algorithm  was  measured  on  several  problems.  The  times  are 
only  for  the  TACOM  selection  portion  of  the  problem,  and  not  the 
line layout portion. For comparison purposes, the execution times
for the ATD algorithm were also measured in the same way. The same
basic strategies for efficiency were used in each implementation.
For the problems with constraints of 10 nodes per line and 50 nodes
per TACOM, and TACOM cost of $1500 per month, the average results
are shown in Table 4. Curves portraying these results are shown in
Figure  22.  It  would  appear  that  the  COM  algorithm  is  substantially 
more efficient. To quantify this comparison, the two curves are
shown on a log-log scale in Figure 23. From these curves, the
execution time of the COM algorithm for these problems may be
approximated by the function $t = (2 \times 10^{-4}) N^2$, and the ATD algorithm
by the function $t = 10^{-4} N^{2.5}$.
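
The growth exponents can be estimated from the endpoints of Table 4, since on a log-log scale the exponent is the slope of the line. The following is a rough check under stated assumptions, not the fitting procedure used in the report; the function name `loglog_slope` is hypothetical, and only the 100-node and 400-node entries of Table 4 are used.

```python
import math

def loglog_slope(n1, t1, n2, t2):
    """Slope of the line through (n1, t1) and (n2, t2) on log-log axes."""
    return math.log(t2 / t1) / math.log(n2 / n1)

# Execution times in seconds from Table 4 at 100 and 400 nodes.
com_slope = loglog_slope(100, 2.1, 400, 36)
atd_slope = loglog_slope(100, 9.4, 400, 275)
```

The two slopes come out near 2.05 for COM and 2.44 for ATD, which is consistent with a quadratic growth for COM and a noticeably steeper growth for ATD.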

The  execution  time  was  measured  in  the  same  way  for  the 
problems  using  the  population  model.  With  a constraint  of  100  nodes 
per TACOM, and TACOM cost of $2500 per month, each algorithm used
fewer  TACOMs  in  its  design  than  in  the  above  uniformly  distributed 
problems,  and  the  execution  time  dropped  accordingly.  Thus,  for 
the  COM  algorithm,  the  time  was  32.8  seconds,  and  for  the  ATD 
algorithm  the  time  was  161.8  seconds.  With  no  constraints  on  the 
TACOMs,  again  fewer  TACOMs  were  found,  and  again  the  times  were 
reduced,  to  32.0  for  the  COM  algorithm,  and  121.6  for  the  ATD 
algorithm.  Note  that  the  decrease  in  the  COM  time  is  much  less 
dramatic  than  for  the  ATD  algorithm.  This  can  easily  be  inter- 
preted as  due  to  the  time  spent  in  the  add  algorithm  phase.  With 
its  simplification  of  the  problem  to  a reduced  number  of  nodes,  the 
COM  algorithm  spends  much  less  of  its  time  in  the  add  phase  than 
does the ATD algorithm, which considers all the nodes in this
phase.  Thus  reduction  of  this  part  of  the  problem  with  greater 
TACOM  capacity  will  have  a much  more  dramatic  effect  on  the  ATD 
algorithm  than  on  the  COM  algorithm.  Note  that  this  last  case  with 
the  add  phase  used  to  select  only  two  TACOMs  will  be  one  of  the 
worst  comparative  cases  for  the  COM  algorithm. 

The basic execution time advantage for the COM algorithm results
from its problem simplification strategy. The merging process can
be implemented with only slightly more complexity than a basic
Kruskal minimum spanning tree algorithm. However, the ATD algorithm
involves considerable computation to determine the equivalent tree
cost of connecting each pair of nodes in addition to generation of
a minimum spanning tree. Furthermore, as noted above, the result
of the simplification is not only conversion to a point-to-point
problem, as is true for both algorithms, but also reduction of the
number of nodes to be used in the "Add" phase for the COM algorithm,
whereas the ATD algorithm has no such reduction.

FIGURE 15: TYPICAL COM DESIGN, EXPERIMENT #1; 100 NODES
(4 nodes/line, 20 nodes/TACOM, $200/TACOM; cost $28909/month)

FIGURE 17: TYPICAL COM DESIGN, EXPERIMENT #3; 400 NODES
(10 nodes/line, 50 nodes/TACOM, $1500/TACOM; design cost $55406/month)


EXPERIMENT #1
4 nodes/line
20 nodes/TACOM
$200/TACOM

                            NUMBER OF NODES (COST IN DOLLARS)
ALGORITHM   RESCOP          50          100         200
COM         1               12754       20348       34353
ATD         1               13409       21332       34443
COM         2               13373       20809       37411
ATD         2               13215       22745       38011
COM         3               12885       21371       36845
ATD         3               13503       22000       37556
Average % Improvement       2.8%        5.3%        1.24%

TABLE 1: PERFORMANCE OF COM ALGORITHM
EXPERIMENT #1.

EXPERIMENT #2

4 nodes/line 
20  nodes/TACOM 
S15CO/TACOM 


ALGORITHM 

RESCOP 

NUMBER 

OF  NODES 

(COST  IN  DOLLARS) 

50 

100 

200 

COM 

1 

14005 

25846 

44275 

ATD 

15335 

25955 

45794 

COM 

? 

16136 

25888 

48005 

ATD 

15757 

27366 

48160 

COM 

3 

15485 

26601 

47337 

ATD 

17403 

27205 

47821 

Average  % 
Improvement 

5.9% 

3.9% 

1.55% 

TABLE  2'  PERFORMANCE  OF  COM  ALGORITHM 
EXPERIMENT  # 2. 
EXPERIMENT #3
(10 nodes/line; 50 nodes/TACOM; $1500/TACOM; cost in dollars)

                          NUMBER OF NODES
RESCOP   ALGORITHM      100        200        400
  1      COM          18393      32023      55406
  1      ATD          20881      33258      57438
  2      COM          18765      32107      58520
  2      ATD          21419      33816      [missing]
  3      COM          18706      32184      5696 [sic]
  3      ATD          22148      34046      [missing]

Average % improvement 13.3%       4.7%       3.5%

TABLE 3: PERFORMANCE OF COM ALGORITHM, EXPERIMENT #3


(10 nodes/line; 50 nodes/TACOM; TACOM cost: $1500/month)

              NUMBER OF NODES (EXECUTION TIME, SECONDS)
ALGORITHM       100          200          400
COM             2.1      [illegible]       36
ATD             9.4         67.7          275

TABLE 4: AVERAGE EXECUTION TIMES FOR ALGORITHMS
VIII. GENERALIZATIONS AND EXTENSIONS

The TACOM location problem has been posed in its most basic
form, and an algorithm, with associated performance results, has
been presented which appears to be an effective approach to the
problem. Several generalizations and extensions of the problem and
algorithm are now considered.

A.  Line  Constraints 


The line constraint used in the basic formulation was
simply a limit on the number of nodes which may share a common line.
This constraint is quite realistic when the traffic is uniformly
distributed among all the nodes. The appropriate maximum number to
ensure acceptable performance can usually be determined quite easily
by either analytical or simulation techniques. However, when the
traffic level for different terminals varies considerably, a simple
number constraint may not be appropriate. When based on the average
traffic, the constraint may yield cases of unacceptable performance,
and when based on the maximum traffic, it will be inefficient. A
constraint we have found effective for such problems is a weighted
sum of traffic and number of terminals. In this case, let $w_i$ be the
average traffic associated with node i, and m be the number of
terminals sharing the same line. The constraint has the form:

\[ k_1 \sum_{i} w_i + k_2 m \le w_{\max} \]

The proper values for the constants $k_1$, $k_2$, and $w_{\max}$ will depend on
the performance requirements and traffic measure used, and may be
determined by analytical or simulation techniques.
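As a minimal sketch of the weighted line constraint, the check below evaluates k1 * (sum of terminal traffic) + k2 * m against the limit. The function name and the sample constants are hypothetical; suitable values of k1, k2 and w_max would have to come from analysis or simulation, as the text says.

```python
def line_is_feasible(traffic, k1, k2, w_max):
    """Return True if the terminals on one line satisfy the weighted constraint.

    traffic -- average traffic w_i of each terminal sharing the line
    k1, k2  -- weights on total traffic and on terminal count
    w_max   -- limit on the weighted sum
    """
    m = len(traffic)  # number of terminals sharing the line
    return k1 * sum(traffic) + k2 * m <= w_max
```

Setting k1 = 0 and k2 = 1 recovers the simple limit on the number of nodes per line used in the basic formulation.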
To incorporate such a constraint in the COM algorithm
requires  only  a simple  modification;  two  numbers  are  kept  for  each 
node  instead  of  one.  The  first  number  is  the  combined  traffic,  and 
the  second,  the  number  of  original  nodes  represented  by  the  node. 
The  feasibility  constraint  is  evaluated  on  the  basis  of  both  these 
numbers,  in  accordance  with  the  above  formula.  The  center-of-mass 
is  determined  only  on  the  basis  of  the  second  number,  i.e.,  the 
number  of  original  nodes  represented  by  the  node.  This  is  the  same 
procedure  as  used  in  the  initial  statement  of  the  algorithm,  and 
reflects  the  intention  for  the  clustering  process  to  identify 
natural  geographical  groupings  of  nodes. 
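The bookkeeping described above can be sketched as follows. This is an illustration under assumptions, not the report's implementation: the `ComNode` record, the planar coordinates and the function names are hypothetical. Feasibility uses both numbers, while the center of mass is weighted only by the count of original nodes.

```python
from dataclasses import dataclass


@dataclass
class ComNode:
    x: float
    y: float
    traffic: float  # first number: combined traffic of the original nodes
    count: int      # second number: how many original nodes are represented


def can_merge(a, b, k1, k2, w_max):
    """Feasibility uses both numbers, per the weighted line constraint."""
    return k1 * (a.traffic + b.traffic) + k2 * (a.count + b.count) <= w_max


def merge(a, b):
    """Merge two COM nodes; the center of mass ignores traffic."""
    n = a.count + b.count
    return ComNode(
        x=(a.x * a.count + b.x * b.count) / n,
        y=(a.y * a.count + b.y * b.count) / n,
        traffic=a.traffic + b.traffic,
        count=n,
    )
```

For example, merging a single heavy-traffic node at x = 0 with a three-node cluster at x = 4 places the new COM node at x = 3, regardless of the traffic values.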

In general, the nodes may each possess a number of attributes
which must be consistent with a set of constraints for feasibility.
This can be formulated as each node having an attribute
vector, $v_i$, and for feasible merging, a constraint function must be
satisfied:

\[ C(v_i, v_j) = \begin{cases} 1 & \text{$i$ can be merged with $j$} \\ 0 & \text{$i$ cannot be merged with $j$} \end{cases} \]

This general formulation can be handled in the algorithm in the
same manner as described above.

B.  TACOM  Constraints 


The TACOM constraint used in the initial algorithm statement
was a limit on the number of nodes which could be served by
a TACOM. This constraint may be interpreted in terms of a traffic
performance relationship. Acceptable performance on the part of the
TACOM may be directly related to the amount of traffic it must process,
and if the traffic is uniformly distributed among the nodes, then
this translates into a constraint on the number of nodes a TACOM may
serve. However, when traffic is not uniformly distributed, it may
be more appropriate to constrain the amount of traffic rather than
the number of nodes. In fact, a weighted sum of traffic and number
of nodes, as in the line constraint case, is often the most
realistic, as there may be an overhead dependent on number of
nodes, and a processing load dependent on traffic. The algorithm
can easily be modified to handle such a constraint. As in the
line constraint case, two numbers are kept for each node: the
first number is the combined traffic, and the second the number
of original nodes represented by the node. During the "Add"
phase of the algorithm, the TACOM feasibility constraint is
evaluated on the basis of a weighted running sum of both numbers.
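A weighted running sum of this kind might be kept as in the sketch below. The order in which clusters are offered to the TACOM, the rule of skipping a cluster that does not fit, and all names are assumptions made for illustration; the report only specifies that feasibility is judged on a weighted running sum of traffic and node count.

```python
def assign_to_tacom(clusters, k1, k2, limit):
    """Offer clusters to one TACOM in the given order.

    clusters -- list of (combined_traffic, original_node_count) per COM node
    Returns (indices of accepted clusters, final weighted load).
    """
    load = 0.0
    accepted = []
    for idx, (traffic, count) in enumerate(clusters):
        extra = k1 * traffic + k2 * count  # weighted contribution of this cluster
        if load + extra <= limit:          # TACOM feasibility check
            load += extra
            accepted.append(idx)
    return accepted, load
```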

The constraint outlined above appears reasonable in
form. However, to obtain appropriate constants for this constraint
may be considerably more difficult than obtaining such
constants for a similar line constraint, as the TACOM's performance
may be intimately related to its particular hardware and/or software
characteristics. Consequently, such constraints are usually
developed with conservative considerations, and are thus not
necessarily inviolate; that is, acceptable performance can often
be achieved even though the constraint is violated. The impact
of this consideration on the algorithm stems from the following
observation: the algorithm creates subproblems for the line
layout process by partitioning the nodes into subsets, each of
which satisfies the TACOM constraint. If the line layout process
were applied to the problem as a whole, with the derived TACOM
sites, it may achieve a more economical result than if applied
to each subset independently, simply because it has a larger
domain over which to "optimize." However, unless a line layout
procedure is used that is sensitive to the TACOM constraint, this
may result in cases where the TACOM constraint is violated.
As noted above, this may be acceptable. For cases where this is
acceptable, the algorithm can be easily modified to simply perform the
line layout procedure after all TACOM sites have been selected.




Network  Analysis  Corporation 


In  some  cases,  the  TACOM  is  more  restricted  by  its 
hardware  line-connection  limitations  than  by  its  traffic  capacity. 
In  this  case,  a constraint  on  the  number  of  lines  that  may  be 
connected  to  a TACOM  is  appropriate.  Such  a constraint  should  be 
incorporated  in  both  the  "Add"  phase  of  the  algorithm,  by  simply 
limiting  the  number  of  clusters  which  may  be  connected  to  the 
TACOM,  and  during  the  line  layout  phase.  This  constraint  might 
reasonably  be  present  along  with  a traffic  constraint. 

C.  Possible  TACOM  Sites 

In the general formulation of the TACOM location problem,
the set of nodes, A, and the set of possible TACOM sites, H,
may be considered as independently defined. It is quite feasible
for situations to occur where the possible TACOM sites are in fact
disjoint from the nodes, partially overlap with the nodes, are a
proper subset of the nodes, are the same as the nodes, or have the
nodes as a proper subset. For simplicity, the COM algorithm was
presented in terms of the case where the possible TACOM sites were
the same as the nodes, thus having to deal with only one set, A.
We will now consider the other cases. In particular, three variations
of the algorithm are presented, each of which appears most
appropriate for a particular type of possible TACOM locations set.

1.  Small  Number  of  Possible  TACOM  Sites 

Consider the case where the number of
possible TACOM sites is much smaller than the
number of nodes. The possible TACOM sites may
be disjoint from the nodes, a proper subset
of the nodes, or partially overlap the nodes
(i.e., some possible TACOM sites are nodes,
others are not). The portion of the algorithm
where nodes are merged into single COM nodes
representing clusters should remain the same.
However, during the "Add" phase, rather than
evaluating the COM nodes as possible TACOM
sites, the actual possible sites (members of
H) should be evaluated. In this way the "Add"
phase will involve a reduced number of nodes,
and will directly result in an actual TACOM
site selection, thereby eliminating the
requirement for the local optimization phase.
Thus, this variation capitalizes on the small
number of possible TACOM sites to improve
computational efficiency.

2. Large Number of Possible TACOM Sites

It is feasible for the number of possible
TACOM sites to be comparable to, or even much larger
than, the number of nodes. For example, if a
major oil company wanted to extend its corporate
domain into the commercial time-sharing world,
the initial number of its customers may be far
less than the number of its service stations
that may be considered as possible TACOM sites.
In the general form of this case, the possible
TACOM sites may be disjoint from the nodes,
contain the nodes as a subset, or partially
overlap the nodes. The handling of this
case is particularly simple. The only
portion of the COM algorithm to be varied
is the local optimization phase. There,
rather than selecting an actual TACOM site
from the k nodes in A closest to the COM
site selected in the "Add" phase, the
selection is made from the k nodes in H
(the set of possible TACOM sites) closest
to the COM site selected in the "Add"
phase.

3.  Possible  TACOM  Sites  Subset  of  Nodes 


In the most common practical problems,
the possible TACOM sites are limited to a
subset of the nodes selected on considerations
of maintenance, rental space, access
by trained company personnel, security, etc.
If the subset is sufficiently small, this
case should be handled as in section one
above. Otherwise, minor modifications of
the COM algorithm are appropriate. First,
in the merge phase, COM nodes which represent
at least one real node that is also a possible
TACOM site should be flagged. Then, during
the "Add" phase, only the flagged COM nodes
are considered as possible TACOM sites.
Finally, as in section two above, in the
local optimization phase the k nodes closest
to the COM node selected in the "Add" phase
should be drawn from the set of possible TACOM
sites rather than the set of all nodes. These
modifications not only satisfy
the design constraint of TACOM sites being
selected from the restricted set of nodes,
but also improve efficiency by restricting
attention during the "Add" phase to a reduced
set of COM nodes (i.e., only those representing
nodes which are also possible TACOM sites).

D. Multiple Capacity TACOMS

There may be several different models of TACOMS available
with different capacities and costs. In this case, the total design
problem requires selection of model as well as site and associated
nodes. There are several possible approaches to this problem. We
mention only one that is particularly simple, yet effective.

During the "Add" phase of the algorithm, each model of
TACOM is evaluated at each site. This requires little increase in
computational burden if the smallest capacity model is evaluated
first, as its stopping point can then be used as the starting point
for the next larger capacity model. The best performance result is
then selected, including model. The process continues as usual until
no further savings can be found.
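The reuse of the smaller model's stopping point can be illustrated with the sketch below. It rests on several assumptions not in the report: candidates at a site arrive in a fixed order with a known saving and node count, capacity is a node limit, and evaluation of a model stops at the first candidate that does not fit. The function and parameter names are hypothetical.

```python
def best_model_at_site(candidates, models):
    """Evaluate every TACOM model at one site in a single pass.

    candidates -- list of (saving, node_count), in the order the "Add" phase tries them
    models     -- list of (name, node_capacity, cost), sorted by increasing capacity
    Returns (net_saving, model_name, number_of_candidates_served).
    """
    best = None
    served = 0   # stopping point of the previous (smaller) model
    nodes = 0    # nodes served so far
    gross = 0.0  # savings accumulated so far
    for name, capacity, cost in models:
        # Resume where the smaller model stopped instead of starting over.
        while served < len(candidates) and nodes + candidates[served][1] <= capacity:
            gross += candidates[served][0]
            nodes += candidates[served][1]
            served += 1
        net = gross - cost
        if best is None or net > best[0]:
            best = (net, name, served)
    return best
```

Because each candidate is visited once across all models, the extra work for evaluating additional models is small, which is the point made in the text.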

E.  Staging 

Staging refers to the interconnection of TACOMS as shown
in Figure 24. A smaller capacity TACOM is connected to a larger
capacity TACOM which is then connected to the RESCOP. There are
many possible approaches to this problem. Perhaps the simplest is
to locate the large TACOMS first in the usual manner, and then to
consider these as RESCOPS for the problem of locating the small TACOMS.

F.  RESCOP  Variations 


The basic problem formulation had a single RESCOP to which
any number of nodes and/or TACOMS could be connected. There are
many other formulations having different RESCOP characteristics.
Several of these are considered below.

1. Constraint on Nodes Connected Directly to RESCOP

A frequent practical problem is the connection
of terminals to a large time-sharing computer.
When the computer is serving a large user population,
the overhead required for direct connection may
prohibit any direct connections of terminals to
the mainframe. In this case, the problem is
formulated in terms of a RESCOP to which no nodes
may be connected.

The  COM  algorithm  is  easily  modified  to 
handle  such  a formulation.  The  cost  of  a TACOM 
is  added  to  the  cost  of  connecting  a node  to  the 
RESCOP,  and  the  termination  condition  for  the 
"Add"  phase  is  changed  from  "no  savings  achieved" 
to  "no  nodes  left." 

If the large computer has a front-end processor
to connect to the TACOMS, the processor may also
serve as a TACOM. In this case, the problem is
formulated as a RESCOP with the first TACOM site
preselected to be co-located with the RESCOP.
The preselected TACOM may have the same or
different node and/or line constraints as the
other TACOMS. This problem is handled as a
RESCOP to which no nodes may be connected,
but with the first TACOM site preselected.

2.  Multiple  RESCOPS 

In  many  problems  there  may  be  multiple 
RESCOPS.  This  may  be  the  case  initially, 
as  in  the  problem  of  locating  TIPS  in  the 
ARPANET,  or  as  the  result  of  staging  TACOMS, 
as discussed earlier. To modify the algorithm
for this case, all that is needed is to evaluate
costs on the basis of connection to the
closest RESCOP.
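A sketch of the multiple-RESCOP cost evaluation follows. It assumes, for illustration only, that connection cost is proportional to Euclidean distance in the plane; a real tariff would replace `cost_per_unit_distance`. All names are hypothetical.

```python
import math


def rescop_connection_cost(node, rescops, cost_per_unit_distance=1.0):
    """Return (cost, index) for connecting `node` to its closest RESCOP.

    node    -- (x, y) coordinates of the node
    rescops -- list of (x, y) coordinates of the RESCOPS
    """
    distances = [math.dist(node, r) for r in rescops]
    i = min(range(len(rescops)), key=distances.__getitem__)
    return cost_per_unit_distance * distances[i], i
```

With this function substituted wherever the single-RESCOP cost was used, the rest of the algorithm is unchanged.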

3. RESCOP Location

The  COM  algorithm  can  also  be  used  as 
an  approach  to  the  general  resource  distribution 
problem.  In  this  problem,  the  question  is 
where  to  place  RESCOPS  to  provide  the  most 
economical  connection  of  all  nodes  to  a RESCOP. 

The  problem  is  different  from  the  TACOM  location 
problem  in  that  there  is  no  initial  facility 
against  which  to  trade  connection  costs. 

The approach to this problem is very
similar to the case of RESCOPS which permit no
direct terminal connections. Each node is
assigned an initial RESCOP connection cost,
which is simply the cost of a RESCOP. Thus,
in the "Add" phase, instead of comparing the
cost of connecting to a selected site to the
cost of connecting to a RESCOP, the comparison
is made to the cost of having a RESCOP at its
own site.

4. Connectivity Requirements

When the TACOM is intended to provide
access to a packet switching subnet (such
as provided by a TIP in the ARPANET), there
may be requirements placed on its interconnection
to the subnet. In the ARPANET
case, such a requirement for a TIP is at
least two-connectivity, and acceptable
impact on network performance and reliability.
To evaluate these constraints for
every site would be too costly. Consequently,
an approach is taken of determining a list of
the k most economical arrangements during the
"Add" phase, and then processing the list to
determine the first feasible arrangement. If
the TACOMs can be interconnected (such as
TIPs connected to TIPs), then the selected
TACOM is considered in place for the next
iteration of the "Add" phase.
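The shortlist-then-check idea can be sketched as below: rank arrangements by cost alone, keep the k cheapest, and apply the expensive connectivity and performance test only to those. The `cost` and `is_feasible` callables stand in for the report's evaluations and are hypothetical.

```python
import heapq


def first_feasible_arrangement(arrangements, cost, is_feasible, k):
    """Return the cheapest of the k most economical arrangements that is feasible.

    The costly feasibility test (e.g. two-connectivity) runs on at most k items.
    Returns None if none of the k cheapest arrangements is feasible.
    """
    shortlist = heapq.nsmallest(k, arrangements, key=cost)  # sorted, cheapest first
    for arrangement in shortlist:
        if is_feasible(arrangement):
            return arrangement
    return None
```

If the function returns None, a larger k would be needed; how to choose k is not addressed here.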

G. Very Large Networks and Further Time Reductions

The COM algorithm may be viewed as composed of a "clustering"
phase, followed by an "Add" phase, followed by a "line-layout"
phase. The results reported above were obtained with a rather
straightforward implementation of the algorithm in which no
particular effort was made to minimize execution time or storage
requirements. However, when working with very large networks
(5,000 - 50,000 nodes), such considerations are imperative. We
give below a brief discussion of two basic techniques that can
be applied to extend the algorithm for applicability to very
large networks.

1.  Sparsity 

In each phase of the COM algorithm, various
interconnections of nodes are considered. In
general, such interconnections may be viewed as
branches of a graph. When all interconnections
are considered possible, and are thus examined,
the graph is a complete graph. However, there
are many interconnections which may easily be
discarded from consideration. This corresponds
to limiting consideration to a sparse graph. Thus,
in determining nearest neighbors, it is
quite efficient to place a grid over the network,
assign each node to a box, and then for each node
only examine nodes in the same box or adjacent
boxes to determine the nearest neighbor. The
box assignment can be made in linear time, and the nearest
neighbor search can then be considerably reduced
in complexity. Very distant nodes are naturally
excluded from consideration. Such an approach
can also be used in the "Add" and line-layout
phases. Note that the sparsity does not imply
inconsistency or inaccuracy in the results, i.e.,
true nearest neighbors will indeed be found. However,
even greater efficiency can be obtained by
coupling the sparsity with acceptable approximations,
i.e., if no feasible neighbors are
found in adjacent boxes, simply abandon the
search, rather than considering the next ring
of boxes.
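The grid technique can be sketched as follows. This is the approximate variant described at the end of the paragraph: only the node's own box and the eight adjacent boxes are examined, and the search is abandoned if they are empty. Square boxes of side `box`, planar coordinates and the function names are assumptions. A neighbor found this way is the true nearest neighbor whenever its distance does not exceed `box`; otherwise a further ring would have to be searched to be certain.

```python
import math
from collections import defaultdict


def build_grid(points, box):
    """Assign each point index to a square box; one pass over the points."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / box), math.floor(y / box))].append(idx)
    return grid


def nearest_in_adjacent_boxes(points, grid, box, idx):
    """Nearest other point in the same or an adjacent box, or None if there is none."""
    x, y = points[idx]
    bx, by = math.floor(x / box), math.floor(y / box)
    best, best_d = None, math.inf
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((bx + dx, by + dy), ()):
                if j != idx:
                    d = math.dist(points[idx], points[j])
                    if d < best_d:
                        best, best_d = j, d
    return best
```

Each query touches at most nine boxes, so the cost per node depends on the local density rather than on the total number of nodes.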

2.  Local  Considerations 


In very large networks, it is often quite
reasonable to make decisions on the basis of
local considerations rather than global
considerations. Thus, in the "clustering" phase, it
would appear quite reasonable to merge nearest
neighbors on a box-by-box basis rather than
choosing the two closest nodes over
the entire network.
FIGURE 24: TACOM STAGING

IX. CONCLUSION

A new algorithm has been presented for the design of multidrop
networks that may incorporate TACOMS (TIP, ANTS, concentrator, or
multiplexer) to economically connect nodes (users) to RESCOPS (resource
connection points). Experiments with the algorithm indicate
that it is both effective and efficient. Extension of the basic
algorithm to handle more general problems was shown to be easily
accomplished.

Research  is  continuing  on  extending  the  concepts  reported 
here  to  the  integrated  design  of  large  (5,000  - 50,000  nodes), 
hierarchical  networks  with  various  levels  of  access  facilities. 
A BRANCH AND BOUND APPROACH TO TOPOLOGICAL NETWORK DESIGN

PART  1 


I.  INTRODUCTION 


We are here concerned with the following topological design
problem for a store-and-forward communications network. We wish
to design a minimum cost 2-connected network that will accommodate
a given amount of inter-node traffic, while keeping the total average
delay that packets experience en route below a certain specified
amount. It is assumed that all circuits have the same capacity C.
The precise mathematical formulation of the design problem is now
presented.

Given:       Requirement matrix R
             Cost-capacity functions $D_i = d_i(C)$, $\forall i$

Minimize     \[ D(A) = \sum_{i \in A} d_i(C) \]
over A, f:   where A is the set of links which correspond to a given
             topology.

Subject to:  (a) f is a multicommodity flow satisfying the requirement
                 matrix R.
             (b) $f_i \le C$
             (c) \[ T = \frac{1}{\gamma} \sum_{i \in A} \frac{f_i}{C - f_i} \le T_{MAX} \]
             (d) The set A must correspond to a 2-connected topology.
At present, only suboptimal techniques are available for the
solution of the design problem. One such technique is the
Branch-Exchange (BXC) method, which consists of improving a given initial
topology by means of a sequence of topological transformations,
called branch exchanges [1]. BXC terminates when all possible
transformations are explored.

A refinement of BXC is the Cut-Saturation (CS) method [2], which
also uses an iterative approach and performs at each step the
topological transformation that is most likely to yield cost-performance
improvement.

A third technique is the Concave Branch Elimination method [3],
which starts from a fully connected topology and eliminates
uneconomical links, until a locally optimal configuration is achieved.

It should be emphasized that the solution obtained by employing
the techniques described above is conceivably a suboptimal solution
to the originally posed problem. We were able to obtain lower
bounds on the optimal solution, and to show that in most cases the
suboptimal cost is no more than 10 - 20% higher than the lower
bound. But there is an indication that the suboptimal solutions are
in fact much closer to the optimum, say within 5%.

It  is  important  therefore  that  we  endeavor  to  develop  a procedure 
that  would  yield  the  exact  solution,  in  order  to  be  able  to  evaluate 
the  efficiency  of  the  suboptimal  techniques,  and  to  determine  whether 
it  would  pay  to  improve  them.  If  the  exact  solution  proves  to  be 
elusive,  we  at  least  hope  to  determine  tighter  lower  bounds  on  the 
total  cost  of  an  optimally  designed  network.  The  lower  bounds  can 
be  employed  along  with  the  upper  bounds  corresponding  to  current 
heuristic  solutions,  to  effectively  trap  the  optimal  solution  and 
reinforce the conviction that our heuristic solution is in fact a very
good one.

In the sequel we describe a branch and bound algorithm for the
exact solution of the topological problem. The algorithm is particularly
attractive because, in addition to providing at the end the
optimal solution, it provides also a lower bound which is increasing
(and therefore becoming more precise) at each step of the computation.
The purpose of this section is to describe the algorithm,
show its convergence to the optimum solution, and discuss its possible
applications. The coding of the algorithm is now in progress; experimental
results will be available in the near future. Such results
will allow evaluation of the computational efficiency of both the exact
method and the heuristic techniques.
II. THE LOWER BOUND


We  propose  to  employ  a modified  branch  and  bound  technique  to 
effectively  construct  an  optimal  topological  configuration  by  an 
exhaustive  display  of  a proper  subset  of  the  set  of  all  possible 
topological  configurations. 

A very  important  ingredient  of  every  branch  and  bound  algorithm 
is  the  determination  of  the  lower  bound.  Lower  bounds  are  generally 
obtained  by  solving  subproblems  which  are  in  a sense  more  favorable 
than  the  original  problem,  either  because  some  of  the  constraints 
are relaxed, or because the real costs are replaced by lower
approximations.

In our case, the subproblem consists of determining the minimum
cost routing on a partially specified topology. A partially specified
topology contains: a set $S_A$ of assigned links; a set $S_U$ of undefined
links; and a set $S_E$ of excluded links. The union of the three sets
$S = S_A \cup S_U \cup S_E$ is the set of all possible links (of which there are
$\frac{1}{2} NN(NN-1)$ in a network with NN nodes).

The assigned links are links that have been definitively introduced
in the network topology and therefore their cost is the real cost
corresponding to leasing a circuit of given capacity between the end
points. Although the value of capacity on the assigned links is fixed,
we can represent the cost of such links as a function of capacity, as
shown in Figure 1. The curve in Figure 1 indicates that the only admissible
value of capacity for an assigned link is C, since there is
no saving for $C_i < C$, and the cost becomes infinite for $C_i > C$.

The undefined links are links for which it has not yet been
decided whether they should be included or excluded from the topology.
Due to this uncertainty, their cost should be somehow proportional
to the link utilization, and should be zero if the link carries zero
flow. Therefore, the cost of an undefined link is assumed linear with
respect to link capacity, and is shown in Figure 2. Notice that the
cost of the undefined link equals the real cost for $C_i = C$. Therefore,
the undefined link cost is a lower bound to the real cost shown in
Figure 1.

The excluded links are links that have been discarded during
preceding branch and bound steps, and therefore are no longer
considered at this stage.

For  the  above  mentioned  partially  specified  topology,  we  are 
interested  in  solving  the  following  subproblem: 


\[ \min_{f,\,C} \; \sum_{i \in S_U} d_i(C_i) \quad \text{such that} \quad T_A(f) \le T_{MAX}, \;\; T_U(f) \le T_{MAX} \tag{1} \]

and f is a multicommodity flow corresponding to the requirement
matrix R,

where:

(1) $d_i(C_i)$ is the linear cost function for the undefined
link i.

(2) $T_A(f) = \frac{1}{\gamma} \sum_{i \in S_A} \frac{f_i}{C - f_i}$ is the total delay on the
assigned links.

(3) $T_U(f)$ is the total delay on the undefined links.

(4) $T_A + T_U$ is the total delay for the partially specified
network.
Let us consider the total cost of the partially specified topology
$D_P$ given by:

\[ D_P = \sum_{i \in S_A} d_i(C) + \sum_{i \in S_U} d_i(C_i) \tag{2} \]

It is clear that at optimality (i.e., after solving problem (1)), $D_P$
represents a lower bound to the cost of any final topology derived
from the partially specified one by assigning a subset of undefined
links. In fact, in subproblem (1), the cost of each undefined link
is a lower bound to the real cost of the link in the final topology.
Furthermore, the delay constraint has been relaxed since the total
delay is required to be $\le 2T_{MAX}$ (instead of $\le T_{MAX}$).

In order to solve problem (1), we first express the cost of the
undefined links $D_U$ as a function of the link flows. From [4], we
have:

\[ D_U = \sum_{i \in S_U} d_i(C_i) = \sum_{i \in S_U} d_i f_i + \frac{1}{\gamma T_{MAX}} \Big( \sum_{i \in S_U} \sqrt{d_i f_i} \Big)^2 \tag{3} \]

This expression relates $D_U$ to the link flows in such a way that the
delay constraint $T_U \le T_{MAX}$ is satisfied.


Since the expression in (3) is concave in f, and we prefer a
convex objective function, we further bound $D_U$ as follows:

\[ D_U \ge \sum_{i \in S_U} d_i f_i + \frac{1}{\gamma T_{MAX}} \sum_{i,j \in S_U} \frac{\sqrt{d_i d_j}}{C} f_i f_j \tag{4} \]

It can be easily shown that the r.h.s. of (4) is convex. By replacing
(4) in the original formulation of the subproblem in (1), we obtain
a convex multicommodity flow problem whose solution still provides a
lower bound on the final cost.
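One way to see both claims about (4) is sketched here. The argument assumes that the flow on every undefined link satisfies $0 \le f_i \le C$, as in constraint (b) of the original problem; the report itself does not spell out this step.

```latex
% Step 1: bound each cross term, using 0 <= f_i f_j <= C^2.
\sqrt{f_i f_j} \;\ge\; \frac{f_i f_j}{C}
\qquad \text{for all } i, j \in S_U .

% Step 2: expand the square in (3) and apply Step 1 term by term.
\Big( \sum_{i \in S_U} \sqrt{d_i f_i} \Big)^2
  = \sum_{i,j \in S_U} \sqrt{d_i d_j}\, \sqrt{f_i f_j}
  \;\ge\; \frac{1}{C} \sum_{i,j \in S_U} \sqrt{d_i d_j}\, f_i f_j .

% Step 3: the bound is a perfect square of a linear function of f, hence convex.
\frac{1}{C} \sum_{i,j \in S_U} \sqrt{d_i d_j}\, f_i f_j
  = \frac{1}{C} \Big( \sum_{i \in S_U} \sqrt{d_i}\, f_i \Big)^2 .
```

Adding the linear term $\sum_{i \in S_U} d_i f_i$ preserves convexity, so the right-hand side of (4) is a convex lower bound on $D_U$.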


Subproblem (1) can now be rewritten as follows:

\[ \min_{f} \; \sum_{i \in S_U} d_i f_i + \frac{1}{\gamma T_{MAX}} \sum_{i,j \in S_U} \frac{\sqrt{d_i d_j}}{C} f_i f_j \tag{5} \]

such that

\[ T_A(f) = \frac{1}{\gamma} \sum_{i \in S_A} \frac{f_i}{C - f_i} \le T_{MAX} \]

and f is a multicommodity flow satisfying requirement matrix R.

Since the objective function in (5) is convex and the constraint
set is also convex, there exists only one local minimum, which is also
the global minimum.

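The solution of (5) can be carried out using the method of
Lagrange multipliers. To do so, we rewrite (5) as follows:

\[ L(f, \lambda) = \sum_{i \in S_U} d_i f_i + \frac{1}{\gamma T_{MAX}} \sum_{i,j \in S_U} \frac{\sqrt{d_i d_j}}{C} f_i f_j + \lambda \Big\{ \frac{1}{\gamma} \sum_{i \in S_A} \frac{f_i}{C - f_i} - T_{MAX} \Big\} \tag{6} \]

s.t. f is a multicommodity flow satisfying requirement matrix R.

The multiplier $\lambda \ge 0$ must be chosen in such a way that, for the
optimal flow $f^*$, we have:

\[ \lambda \Big( \frac{1}{\gamma} \sum_{i \in S_A} \frac{f_i^*}{C - f_i^*} - T_{MAX} \Big) = 0 \tag{7} \]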
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A possible  way  to  solve  (6)  is  to  select  an  arbitrary  A0,  min- 
imize L (f , A 0 ) over  f,  then  verify  if  Equation  (7)  is  satisfied.  If 
(7)  is  not  satisfied,  a different  A,  say  A1,  is  chosen  and  L (f r A1) 
is  minimized,  etc.  We  anticipate  that  the  search  on  A will  not  be  a 
serious  computational  bottleneck  since  o(A)  = min  (over  f)  L(f,A) 
is  monotonic.  Furthermore,  we  don't  need  to  solve  Equation  (6)  with 

great  accuracy,  but  rather  can  be  satisfied  with  a lower  bound  to  its 
optimum  solution. 
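The search on the multiplier can be sketched as a bisection. The Python fragment below is only an illustration under stated assumptions: `minimize_lagrangian` is a hypothetical callback that stands in for the inner minimization of L(f, lambda) over f (the report uses the Flow Deviation Method for that step), and the sketch relies on the constraint violation at the minimizer not increasing as lambda grows. The inner problem used here is a one-variable toy, not a multicommodity flow problem.

```python
def search_multiplier(minimize_lagrangian, lam_hi=1.0, tol=1e-9, max_iter=200):
    """Bisection on lambda >= 0 so that condition (7) holds approximately.

    minimize_lagrangian(lam) must return (f, slack), where f minimizes
    L(f, lam) and slack = T(f) - T_MAX (slack <= 0 means the delay
    constraint is satisfied).
    """
    f, slack = minimize_lagrangian(0.0)
    if slack <= 0.0:
        return 0.0, f                      # constraint inactive, lambda = 0
    for _ in range(60):                    # grow the bracket until feasible
        if minimize_lagrangian(lam_hi)[1] <= 0.0:
            break
        lam_hi *= 2.0
    else:
        raise ValueError("delay constraint cannot be met")
    lam_lo = 0.0
    lam = lam_hi
    for _ in range(max_iter):
        lam = 0.5 * (lam_lo + lam_hi)
        f, slack = minimize_lagrangian(lam)
        if abs(slack) <= tol or lam_hi - lam_lo <= tol:
            break
        if slack > 0.0:
            lam_lo = lam                   # still violating: raise lambda
        else:
            lam_hi = lam                   # over-satisfied: lower lambda
    return lam, f

def toy_inner(lam):
    """Toy stand-in: minimize (x - 3)^2 + lam * (x - 1) over x."""
    x = 3.0 - lam / 2.0
    return x, x - 1.0

lam, x = search_multiplier(toy_inner)
print(round(lam, 6), round(x, 6))
```

For the toy problem the constrained optimum is x = 1 with multiplier 4, which is what the bisection converges to.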

At this point, the topological problem for a partially specified topology has been reduced to the solution of an unconstrained, strictly convex, multicommodity flow problem, namely, the minimization of $L(f,\lambda)$ over f. Such a minimization is readily carried out with the Flow Deviation Method [5], an efficient tool for the solution of nonlinear flow problems.

We have shown how we can find a lower bound on the cost of all possible topologies deriving from a partially specified topology. The next section shows how we can use these results for the search for the globally optimal topology.

III. THE BRANCH AND BOUND CONCEPT

First, we describe the branch operation. Suppose we have a partially specified topology $T_n$ to which are associated three sets of links $S_A$, $S_U$, $S_E$ defined above. From such a topology, we can "branch" to two new topologies by removing a link, say link i (selected with some well defined criterion), from $S_U$ and reassigning it to either $S_A$ or $S_E$. The two new topologies $T_{n+1}$ and $T_{n+2}$ correspond to the following sets of links:

$$T_{n+1} = (S_A + \{i\},\; S_U - \{i\},\; S_E)$$
$$T_{n+2} = (S_A,\; S_U - \{i\},\; S_E + \{i\})$$

It is clear that if we start from the topology $T_0$ in which all links are undefined (i.e., $S_A = S_E = \emptyset$), and proceed branching until all the final topologies have $S_U = \emptyset$, we have generated all the $2^{NN(NN-1)/2}$ graphs that can be constructed on NN nodes.

The branch and bound technique is essentially an intelligent way of searching the tree of successive derivations and branches, without enumerating all the $2^{NN(NN-1)/2}$ possible solutions. The instrument that helps us in this search is the notion of lower bound, discussed in the previous section.

A typical step of a branch and bound procedure is now described. Suppose we have generated, by means of successive branch operations, a part of the tree (see Figure 3), and suppose the incomplete tree has p "leaves" (i.e., terminal topologies), namely $T_n, T_{n+1}, \ldots, T_{n+p-1}$, with respective lower bounds $LB_n, LB_{n+1}, \ldots, LB_{n+p-1}$, computed by solving the associated subproblems (as discussed in Section 2). The next branch move is performed on topology $T_{n+s}$ such that

$$LB_{n+s} \le LB_i \quad \forall i = n, \ldots, n+p-1.$$

To show that the procedure converges to the optimum solution, we first must show that the lower bound for the two topologies generated by a branch operation (successors) is greater than or equal to the lower bound for the originating topology (predecessor). But this is true since the two successor subproblems have more constraints than the predecessor subproblem, so the minimum for the former cannot be lower than the minimum for the latter.

The branch and bound procedure terminates when the topology which minimizes the lower bound corresponds to a feasible topology (i.e., is such that the set $S_A$ can satisfy all the requirements with $T \le T_{MAX}$ and corresponds to a 2-connected topology). Such a topology is optimal because its lower bound corresponds to the actual cost, and is lower than or equal to the lower bound for any other infeasible topology, and therefore (because of the nondecreasing property of lower bounds), lower than or equal to the cost of any other feasible topology.

To summarize, an iteration of the branch and bound procedure contains the following steps:

A. Determine, among all the topologies so far generated, the topology which minimizes the lower bound. If such a topology is feasible, STOP: optimum has been found.

B. Perform a branch operation on such topology.

C. Compute the lower bounds for the two successors and check for feasibility. Go to A.
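Steps A to C can be sketched as a best-first search over triples (S_A, S_U, S_E). The Python sketch below is not the report's procedure: to stay self-contained it replaces the flow subproblem with a toy lower bound (the cost of the assigned links, or infinity when no completion can connect the nodes) and replaces the delay and 2-connectivity test with plain connectivity of S_A. The function names and the three-node example are hypothetical.

```python
import heapq
from itertools import count

def connected(nodes, links):
    """True if `links` connect all `nodes` (depth-first search)."""
    adj = {n: set() for n in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for m in adj[stack.pop()]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(adj)

def branch_and_bound(nodes, cost):
    """Best-first branch and bound on a toy topology problem."""
    def lower_bound(s_a, s_u):
        # Toy bound: nondecreasing along branches, exact when S_U is empty.
        if not connected(nodes, s_a | s_u):
            return float("inf")
        return sum(cost[link] for link in s_a)

    tie = count()  # tie-breaker so the heap never compares frozensets
    root = (frozenset(), frozenset(cost), frozenset())
    heap = [(lower_bound(root[0], root[1]), next(tie), root)]
    while heap:
        lb, _, (s_a, s_u, s_e) = heapq.heappop(heap)   # step A
        if lb == float("inf"):
            return None                                # nothing feasible
        if connected(nodes, s_a):                      # toy feasibility test
            return lb, s_a
        i = min(s_u, key=lambda link: cost[link])      # step B: pick a link
        for child in ((s_a | {i}, s_u - {i}, s_e),
                      (s_a, s_u - {i}, s_e | {i})):
            bound = lower_bound(child[0], child[1])    # step C
            heapq.heappush(heap, (bound, next(tie), child))
    return None

nodes = {0, 1, 2}
cost = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}
best_cost, best_links = branch_and_bound(nodes, cost)
print(best_cost, sorted(best_links))
```

Choosing the cheapest undefined link in step B is just one possible selection criterion. The heap implements the "minimum lower bound first" rule of step A.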


IV. COMPUTATIONAL CONSIDERATIONS


The only way to determine the computational efficiency of a search algorithm such as the branch and bound method is to perform experiments on a reasonable number of sample problems of various sizes and complexity. Since experimental results are not yet available, we can only make qualitative statements.

It  can  be  anticipated  that  the  computational  effort  required 
to  find  the  optimal  solution  will  be  considerably  larger  (by  several 
orders  of  magnitude)  than  the  effort  required  by  suboptimal  techniques. 
Therefore,  it  is  very  unlikely  that  branch  and  bound  is  a suitable 
network  design  tool. 

The indication that the method will be very time consuming follows from the conjecture that there is typically a very large number of near-optimal solutions. The method will have to search all such solutions before being able to declare that a particular solution is optimal. Incidentally, it is of interest to note that the presence of a large number of near-optimal solutions is detrimental to the efficiency of exact methods, but is beneficial to the efficiency of heuristics, and of locally optimal techniques in general.

In favor of branch and bound, we have the following facts:

A. It is much less time consuming than enumeration. In fact, although there are a lot of good topologies, there is a much larger number of very bad topologies that will never be explored by branch and bound.

B. There are a variety of features in a branch and bound algorithm that can be properly tailored in order to obtain maximum efficiency. For example, one must properly design the selection criterion for links to be assigned or excluded, based on cost, utilization, connectivity, etc. Also, the results of the solution of a predecessor subproblem could be used for the solution of successor subproblems, with substantial computational savings. In addition, one might consider other types of search (e.g., the depth first search) instead of the steepest descent search described in the previous section.

C. Although the optimal solution is elusive, it is likely that the bound becomes sufficiently tight long before we find the optimum. In particular, we hope that a reasonable computational effort will reduce the gap between heuristic solution and lower bound from 10-20% (as we have now) to less than 5%.

V. IMPLICATIONS FOR FUTURE RESEARCH

Besides the determination of globally optimal topologies, the branch and bound concept can be used for other applications related to topological design. Two possible applications are described here.

Suppose we want to optimally upgrade an existing network configuration, following an increase of traffic requirements and the installation of new nodes. Let us assume that we are able to identify a set of candidate links that most likely includes, as a subset, the set of links corresponding to the optimum topological reconfiguration. The branch and bound algorithm can be applied setting $S_A$ = set of original links and $S_U$ = set of candidate links. If the optimal solution appears to be elusive, we can always use the lower bound information to control a suboptimal technique such as the Cut-Set Method.

Another area of application of the branch and bound concept is within a suboptimal procedure. During the application of the Cut-Saturation Algorithm [2], for example, we are at each step confronted with the problem of which new links to introduce, or which old links to eliminate from the current topology. Presently, a choice is made without the ability to evaluate the performance of the remaining alternatives. To avoid this, one might think of solving a partially defined topological problem, with $S_U$ corresponding to the set of links which are candidates for introduction or removal. The nature of the solution (e.g., the relative utilization of links, etc.) might offer valuable insight into the relative cost effectiveness of the links, and provide guidance in the selection of links to introduce or eliminate.

The branch and bound algorithm described here can be further extended to the solution of the multiple capacity option problem. However, it is conceivable that the computational complexity will rapidly increase due to the large number of possible combinations.

In summary, we have shown that branch and bound is a valid approach to the exact solution of the topological problem; we have also indicated that it might find applications in the solution of practical problems, when combined with heuristics or used to develop bounds. We have to wait now for the first experimental results in order to better define the area of applicability of this theoretically very attractive technique.

A SYSTEM FOR LARGE SCALE NETWORK COMPUTATIONS - PART II


In order to support the many and varied network calculations described in other chapters of this report, a software system for large scale network computations is being developed. Besides the support function, this system is designed to be portable and general so that it can be used for almost all network based applications in a distributed computational environment such as ARPANET. While the initial applications of the system within NAC have been to communication networks, discussions are in progress with the Army to apply parts of the system to applications in risk analysis and circuit analysis. Within NAC the software has been used with:

(1) RELROUT: An interactive program for analyzing the cost, throughput, delay, and reliability of packet switched data networks. This program is described in the chapter: Impact of Interactive Graphics on Network Design.

(2) A simulator of the Packet Radio System: This application is notable because it is a hybrid batch-interactive program running on three computers: an IBM 360-91, a PDP-10, and an Imlac PDS-1 Graphic Display Computer. The simulator is described in the chapter: Simulation of Computer Communication Networks.

(3) A heuristic algorithm for Set Covering Problems, which illustrates another dimension of interactive graphics used for network analysis. This is the use of visual feedback to guide the design of heuristics. The heuristics and their application to radio repeater location problems are described in the chapter: Repeater Location Optimization.

The software in the system consists of three parts:

(a) Input, output, and editing functions,

(b) Data structuring and mapping,

(c) Languages for coding network algorithms.

Virtually all the work has been done in the first area. The first version of the network editor described in the Second Semiannual Report has been implemented; a simple but effective windowing for the graphics display is 90% implemented; and general purpose parameter selection software has been implemented. It is this software that has been used in the application programs described above.

Exploratory work is being carried out in the second area of data structuring and mapping. The immediate goal is to simplify hybrid interactive-batch calculations on the ARPANET. This is being done in conjunction with the dynamic modelling group at project NAC.

Probably the best characterization of the software is to show how it is used. To do this we first reproduce a scenario showing the use of the TTY version (without graphics) of RELROUT. Following this are photographs of some of the graphics displays. Figure 1(a) shows the June 1974 ARPANET. Figures 1(b) and 1(c) show two 2x enlarged details using the windowing capability. Figures 2(a) and 2(b) are graphic output from the reliability and routing programs respectively. In 2(b) throughput-delay curves are superimposed for three different line capacities: 19.2, 50, and 230 kilobits per second.

FIGURE 1(C)

20 5 WE ADD A NEW CROSS COUNTRY LINK
I S I


PP fi 5 WE ARE GOING TO DO A RELIABILITY ANALYSIS WHERE ONLY LINKS FAIL SO WE SET ALL NODE FAILURE PROBABILITIES TO ZERO

TER 

PROPEPT 

V COMMAND  OR  ? TAM 

; me  1 

“Ljcrrix 

NODE 

Cp 

ONG  L 

.A  i 

i 

ETAC 

NES 

ABEC' 

0 . 0 11 0 0 0 0 0 
o . 0 0 0 0 0 0 0 

0 . O 0 0 0 0 it  0 

77.0  IJ  U 
77.160 
77 . 0011 

39 . 

Ct  m 

;! 0 
1 3 0 
0 0 0 

4 

P!  ITG 

n . oooooon 

74.450 

40. 

43  0 

5 

HAPV 

o . o 0 0 0 0 0 0 

71  .£50 

4£  . 

5 0 0 

k, 

NCC 

ft . o 0 0 0 0 0 0 

71  .£50 

42  . 

5 0 0 

BEN 

o . 0 0 0 0 0 0 0 

71  .£50 

4£  . 

5 0 0 

GCA 

fi . o 0 0 0 0 0 0 

71  .330 

42  . 

500 

Q 

M I T £ 

o . 00000  00 

7 1 . £ 0 0 

42  . 

5 0 0 

i n 

MIT  I 

o . 0 0 0 0 0 0 0 

71  .£00 

42. 

5 0 0 

i i 

LINE 

0 . o 0 0 o 0 0 n 

71  .330 

42  . 

53  0 

i 3 

PA  DC 

0 . 0 0 0 0 0 0 0 

75. 4£0 

43  . 

£5  fi 

i 

CASE 

I'I . o 0 0 0 0 0 0 

31  .750 

41  . 

5 0 0 

1 4 

CM'J 

0 . 0 0 0 0 0 0 0 

73.330 

40. 

5 0 0 

15 

BELV 

0 . 0 0 0 0 0 0 0 

77 . 000 

33  . 

, 03  0 

1 6 

3D  AC 

o . 0 0 0 0 0 0 0 

77 . 160 

• 

,9£0 

17 

MITRE 

0 . 0 0 0 0 0 0 0 

77.000 

39, 

, 000 

1 9 

ARP  A 

o . 0 0 0 0 0 0 0 

77.000 

39  . 

. 000 

19 

CM! 

0 . 0 0 0 0 0 0 0 

3 0.570 

DO 

.£50 

2 0 

IS  I 

0 . 0 0 0 0 0 0 0 

1 13.530 

34 

. 0 0 0 

£1 

RAND 

0 . 0 0 0 0 0 0 0 

1 13.530 

.920 

o p 

UCSD 

o . 0 0 0 0 0 0 0 

117.160 

:32 

.660 

- 

UCLA 

0 . 0 0 0 0 0 0 0 

1 1 3 . 5 £ 0 

3 4 

. U7  0 

£4 

”.Cj 

3 DC 

i |p r 

0 . 0 0 0 0 0 0 0 
ft . i'i  0 0 000  0 

1 13.550 
113.350 

34 

34 

. UXO 
.000 

DO  CD 

o . 0 0 0 0 0 0 0 

1 05 .000 

39 

.500 

•“*.ir 

i'i  . i'i  0 0 0 0 0 0 

36 . 0 0 0 

41 

. 0 0 0 

30 

WPAFB 

0 . 0 0 0 0 0 0 0 

3 4. £00 

39 

.750 

Q 

ILL  I 

0 . 0 0 0 0 0 0 0 

S3 .500 

40 

. 030 

i'i 

UTAH 

0 . 0 0 0 0 0 0 0 

1 1 1 .330 

4 0 

.660 

Si 

LEL 

i i i 

0 . 0 0 0 0 0 0 0 
n . n noon oo 

i££.£30 
1 £ 1 .750 

•J'  I 
O i 

• S3  0 
.63  0 

ts 

SR  I 

0 . 0 0 0 0 0 0 0 

1 ££ . 1 6 0 

-•  i" 

. 36  0 

3 4 

XEROX 

o . 0 0 0 0 0 0 0 

1 £ £ . 170 

37 

. 3 0 0 

A 5 

, TYMSH 

o . 0 0 0 0 0 0 0 

i£l .300 

37 

. 33  0 

••ij 

FNMC 

fi . Ij  0 0 0 0 0 0 

1£1 .320 

36 

.50  0 

37  UCSF 

o . 0 0 0 0 0 0 0 

1 19.750 

34 

.500 

IF  IT  WORKEED
NAC  NS

CENARIO . J 1 

MON  3'ij-MAi 

l'  4 X i .7 

'•i  :tan 

ft  . ij  ii  ilium  ft 

I CC  ■ i i 

37  . ft  ft 

-:-r*  AMEC 

ij  . ft  ft  ij  ft  ft  ft  ij 

1 22 . i'i  ? ft 

C7 ,33ft 

4ft  MOFF 

ij  . ij  i j ft  ij  i j ft  ft 

133 . ft:?  ft 

37  . £”::  n 

■41  EBNT 

ij  . ft  ft  ij  i j ij  ft  ft 

7 i . EC ft 

43 .5 ft ft 

43  AMET 

ij . i.i  ij  ij  ij  ft  ft  ft 

i 22 . ft 3 ft 

37.33ft 

iER  PROPERTY  COMMAND 

OR  7 TL  1 

5 3 ft ; WE 

Network  Analysis  Corporation 
C'AGE  i : 2: 


♦ ♦Nf 

MODE  PA  I P FP  CPlP  CLDM  rrj-- 

CM-co^'c  crin-r^”.1'1  * ‘ * U . IJ  ft  l.l  1 1 l.l  1 .11.1  ft.  ft  ft  ft  . 0 Ij  I'l  . I'l  I'l 

EM  i EF  PROPER  I V COMMAND  DP  7 

-L  15  wO  FP-i  . CAP*5ft*WE  7PV  ALTERNATC  pci  I 
♦♦MITER"  

EMiEP  PROPERTY  COMMAND  OP  7 Tr 

EN7EP  'I  YSTEM  COMMAND  OP  ~ p 

EN7EP  PP0PEP7V  COMMAND  OP  ' yj_  ^ :.g  nc, 


FLOM  C O'!  7 

0 . 0 Ij  ij  . ft  ft 


NODE  PAIR  FP  CAP  Cl  OM  rnr- 

EN7EP  PROPERTY  COMMAND  " no  ' ' " " " 5L'‘°'J  U * L"J  0 * 00 

fsss^rs^i  skt  ,je  50  td  "«*«. 

ONE  OF  7HE  FOLLOWING: 

p PUN  RELIABILITY*  LP  LI  *7  c'Ac'AMcr7crPr 

CP  CHANGE  PARAMETER;  DP  DEFAULT  c-qp^^cTcpc 

1 7AEULAR  OUTPUT?  G GRAPHIC  O'  17*=' IT 

■c-l-  WPI7E  P.JC  FILE  FOP  rrN  *?i 
EN7EP  COMMAND  OP  CP  •• 

LP 

PELIAEILI7Y  PARAMETERS  APE: 

N'.AMP  i 00  NUMBER  OF  CAMPLE:" 

’ 77^'  0 RANDOM  NUMBER  CEED 

ri!=,:':p  1 . Ci  Ci  Oft  ft  ft  ft  PROBABILITY 

ENTER  RELIABILITY  COMMAND  OR  ? ,CP 

ErtlEP  KEYWORD  > VALUE  OP  * . NS  AMP  iftJME  REDUCE  NUMBER  Or  tamo,  c*  IN 

VALUE  OF  NS  AMP  CHANGED  FROM  ^ ”VE  ' 

ENTER  KEYWORD  .VALUE  OP  7 * 1 C 

ENTER  KEYWORD  OP  CR ■ < 

ENTER  RELIABILITY  COMMAND  OR  ? NOW  RUN  THE  RELIABILITY  ANALYSIS 

COMPUTATIONS  PROCEEDING 

ENIER  RELIABILITY  COMMAND  OR  ? <T?ASK  FOR  TABULAR  OUTPUT 
ENiER  OUTPUT  COMMAND  OR  7 < ? J-'I.  -U 

ONE  OF  THE  FOLLOWING: 

~ ~ 1 : TO  TELETYPE  ? F 73  pj-j,;’  c ji  c 

<CR  CP > TO  CANCE'  ' *"*“ 

ENTER  OUTPUT  COMMAND  OR  ? •:  TTY : 
NAC  NS

CENRP 

ID.il 

mm-i 

PEL  IRE' 

IL I TV 

<=  "Jf4  : 

NUNBEP  GF 

NGDES" 

= 42 

c-c 

D- 

i'i 

Mm: 

. PPDB . - 

L . ( 

j n i'i  0 i'i 

HIT 

'■ P T 

n CGN! 

riN'JE 

NGN  2 O-NRY-74  3 : 35RM 


M'JMBEP  GF  EPRNCHES-  4? 
MUMBEP  OF  St-tMPLE'- 

p p □ b . i n c p . - o . j 5 m o i‘i 

n in  stop  <
1 0


EKPECiED  FPRCTIDN  GF  NODE  PRIES 


NG  i C DNMUN I CRT  I NR 


frctgf 

EXPEL  1 R 1 IGN 

II . II l.l 

u . i.i i.i ij  ii 00  OEf  on 

i'i  . 05 

0. 29349 OiE- 01 

I'I . i 0 

0 . i 429733Ef 0 0 

I'I . 1 5 

0 . 243 039 4Ef  0 0 

0.2  0 

0 . 4 1 0453 OEf 0 0 

'1 .25 

0 .6  305  459Ef 00 

0.3  0 

0 .703 434 3Ef 0 0 

i'i . 35 

II  • t i i 5 1 0 0 

0 . 4 i'i 

i.i  • *r*55  4 1 1 U7E^  0 0 

0 . 45 

0 .9049942Ef on 

0 . 5 0 

0 .9272933Ef on 

0.55 

0 . 9 3 46 1 09Ef 0 0 

0.6  i'i 

0 .949361 2Ef 00 

0.6C 

0 .961 091 SEf 00 

U • i l.l 

|J . 969 9 1 3 7Ef  0 0 

0.75 

0 ,976422SEf 00 

'J . 3 0 

0 .93  3 0 43  OEf  00 

0 .35 

ij  • 9332*95Ef  0 o 

u . - 0 

0 . 99 1 936 1 Ef  0 0 

0.95 

'J  .9962334Ef  On 

HIT  CP 

TG  CONTINUE  i 7 N TG 

PPDERBILIT 

CRC  TGP 

e::pec  TRTIGN 

i'i  . 0 0 

0 . OOOOOOOEf 00 

0 . i'i  5 

0.20  0 0 0 0 OEf 0 0 

0 . ! 0 

0 .7 00 000 OEf 00 

0 . 1 5 

0 .70 00 00 OEf 00 

0 . 2 0 

0 . 1 0 ij  0 0 0 OEf  0 i 

0 .25 

ij . i ij  0 ij  0 i'i  OEf  0 1 

0 . 3 0 

0 . i 0 0 ij  0 0 OEf  0 i 

0 . 35 

0 . 1 00 000 OEf 01 

0 . 4 0 

0 . 1 00 000 OEf 0 1 

i'i . 45 

0 . 1 0 ij  0 0 0 OEf  0 1 

0 . 5 0 

0 . 1 00 000 OEf 01 

0 .55 

0 .100  0 0 0 OEf  0 1 

0 . 6 0 

Ij  . 1 00 0 00 OEf  01 

0 .65 

0 . 1 00 000 OEf 01 

0 . 7 0 

0 . i 0 0 0 0 0 OEf  0 i 

0 . 75 

0 . 1 00 000 OEf 01 

0 . 3 0 

ij  .1  00  0 0 0 OEf  0 1 

0 .35 

0.1000 0 0 OEf  0 1 

0 . 9 0 

0 . 1 00 000 OEf 01 

0.95 

0 . 1 0 0 0 0 0 OEf  0 1 

VRPIRNCE 
0 . o n oooooEf 00 
0 . 43 4 69 34 Ef 00 
0 .2440S47E+0i 
0 . 424327 1 Ef  0 i 
0. 46634 19Ef 01 
i'i  . i i 15992'Ef  01 
0 . 3 5075 6 0 Ef  0 0 
0 .6 032 07 OEf 00 
i'i  . 2992462Ef 00 
0 . i 156519Ef 00 
0.37031 07E-01 
0 . 333632 4E- 0 1 
0 . 3 i 73 1 36E- 0 1 
0 .271 1332E-01 
ii . 1 457  462E-  0 1 
0 . 1 061702E-01 
0 . 7367336E- 02 
0 . 3 424797E - 02 
0 . 23 093 1 4E- 02 
0 .46 0220 7E- 03 


□F  NET  DISCONNECTED 


VFtP  I RNCE 
0 . OOOOOOOEf 00 
0 . 1600000E-01 
0.21 00000E-01 
0 .21 0000 0E- 01 
0 . 0 0 0 0 0 0 OEf  0 0 
0 . 0 0 0 0 0 0 OEf  0 0 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . 00  00 OOOE-i-OO 
0 . 0 0 0 0 0 0 OEf 0 0 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
0 . OOOOOOOE-t-OO 
4.12 


STRNDRPD  DEV 
i'i  . 0 0 0 0 0 U i'i  E-t-  00 
0 .696 1992  E-t-  00 
0 . 1 56232  i E-t-  0 1 
0 .2061  133E+01 
0 .2  i 59  495Et-0  1 
0 . 1 0564 05 E- 01 
0 .92236  44  Et  00 
0 . 776664 OEf 00 
0 .541 5221 Ef 00 
0 .3 40 076 3 Ef 0 0 
0 . 1924346Ef 0 0 
0 .1971 376Ef 0 0 
0 . 1 732 733 Ef 0 0 
0 . 1 646623Ef 0 0 
0 . 1207254Ef 00 
0 . 1 0303S9Ef 0 0 
0 .3533639E-0  1 
0 .5352176E-0  1 
0 .4306  053E- 0 1 
0 .2145275E-01 


STRNDRPD  DEV 
0 . 0 00 0 00 OEf 0 0 
0 . 1 26491  iEf 0 0 
0 . 1 44913SEf 0 0 
0 . 1 449 13SEf  0 0 
0 . 0 0 00 00 OEf 0 0 
0 . 00 00 00 OEf 0 0 
0 . 0 0 00 00 OEf 0 0 
0 . 0 0 0 0 0 0 OE  f 0 0 
0 . 0 000 00  OEf  0 0 
0 . 0 0 00 00 OEf 0 0 
0 . 0 0 0 0 0 0 OE  f 0 0 
0 . 0 00 0 00 OEf 0 0 
0 . 0 000 00 OEf 0 0 
0 . 0 000 00 OEf 0 0 
0 .0  000 00 OEf 0 0 
0 . 000000  OEf 0 0 
0 .0000 00 OEf 0 0 
0 . OOOOOOOEf 0 0 
O.OOOUOOOEfO  0 
0 . 0 00  0 0 0 OE  f 0 0 
MON 20-MAY-74 8:35 AM          PAGE 2:1

DONE.

ENTER RELIABILITY COMMAND OR ?

ENTER SYSTEM COMMAND OR ? ROUT ;NOW WE DO A ROUTING ANALYSIS FOR THE SAME NET

ENTER ROUTING COMMAND OR ? ?
ONE OF THE FOLLOWING:

R  - RUN ROUTING ANALYSIS          DF - DEFAULT ROUTING PARAMETERS
LP - LIST ROUTING PARAMETERS       Q  - QUIT ROUTING SECTION
CP - CHANGE ROUTING PARAMETERS     TR - CHANGE TRAFF. REQUIREMENTS
O  - SEE OUTPUT                    D  - DRAW NETWORK

ENTER ROUTING COMMAND OR ? LP


POUTING  PARAMETERS  APE: 

KEYWORD  CURRENT  VALUE 


DESCRIPTION 


IT  MAX 


M H. 500000 
DELRY  0.200000 
pSnVP  0.350000 
F"jnvp  0.070  non 
NDPPS  0. 00 i 000 
-’NiPQ  I.  000  ooo 


pppgfl 

0 . 0 0 0 0 03 

THACC 

0.000100 

TIACC 

0 . 0 0 0 i 0 0 

NCAP 

CAP! 

50.000 

FIX! 

350 . 0 00 

PAT  i 

5.00  0 

CAP  2 

230 .000 

FIX2 

i 3 00.0 0 0 

PAT2 

3 0 . 0 0 0 

DESIRED  NUMBER  DF  ITERATIONS 
AVG . c‘ACKET  LENGTH  ‘ KBITS1 
AVG.  PACKET  DELAY  <:SEC> 

PROTOCOL  OVERHEAD  <■  FRACTION..* 

routing  overhead  '.fraction;.' 

NODAL  PROCESSING  TIME  >,SEC> 

UNIFORM  TRAFFIC  REQUIREMENT  '..KB-:*.' 

LINE  PROROGATION  DELAY  'SEC-MILE 

THROUGHPUT  ACCURACY 

TIME  DELAY  ACCURACY  '.SEC  ' 

number  OF  DIFFERENT  CAPACITIY  OPTIONS 

CAPACITY C K B / S > OF  CAP  OPTION  « i 

SUM  OF  ALL  FIXED  COSTS  FDR.  CAP  OPT.  if  i 

PATE-  MILE  FOP  CAPACITY  OPTION  if  i 

CAPACI TY'  KB  'S  • OF  CAP  OPTION  if  2 

SUM  OF  ALL  FIXED  COSTS'  FOR  CAP  OPT.  if  2 

PATE  MILE  FOR  CAPACITY  OPTION  it  2 


FOP  HELP  ON  THESE  PARAMETERS » ENTER  T 
ENTER  POUTING  COMMAND  OP  * <%rp' 

ENTER  KEYWORD  . VALUE  OP  HCI  c’  ;yc 

OKIE  OF  THE  FOLLOWING: 


HE  ROUTING  COMMAND  <CP> 
WANT  MORE  INFORMATION 


< KEYWORD > .<  VALUE  > 

<CP>-:..CR> 

LIST 

HELP  < KEYWORD > 

L|CI_p  pi  | 

ENTER  KEYWORD  * VALUE 


TO  CHANGE  VALUES 
TO  TERMINATE 

LISTS  KEYWORDS  AND  VALUES 
DESCRIBES  < KEYWORD > 
DESCRIBES  ALL  KEYWORDS 
OR  ? < 


ITMAX : MAX  NUMBER  OF  ROUTING 
EPATIONS  TO  SAVE  RECOMMENDED 

enter  keyword. value  or"? 


HELP  ITMAX 
ITERATIONS  DESIRED. 

MAX  IS  10.  GOOD  APPROXIMATION  AFTER  5 


VALUE  OF  ITMAX  CHANGED  FROM 
ENTER  KEYWORD. VALUE  OP  ? 
ENTER  KEYWORD  OR  <CR> 


ITMAX-2  JWE  REDUCE 
5 TO 


ITMAX  TO  SAVE  TIME 

ENTER ROUTING COMMAND OR ?
COMPUTATIONS PROCEEDING
AT ITERATION 1, THRUPUT= 230.993 KB/S   DELAY= 0.19997 S
AT ITERATION 2, THRUPUT= 329.397 KB/S   DELAY= 0.19992 S

DESIRED NO. OF ITERATIONS REACHED

THRUPUT= 329.397 KB/S   WHICH IS 19.153 OF THE BASE REQUIREMENT
DELAY= 0.19992 SEC      TOTAL COST = 1033?? $/MO

ELAPSED TIME: 0 MIN., 21 SEC.   CPU TIME: 0 MIN., 10 SEC.

CONTINUE ROUTING WITH MORE ITERATIONS ? ENTER Y OR N
N

ENTER ROUTING COMMAND OR ? O ;WE LOOK AT OUTPUT
ENTER OUTPUT COMMAND OR ? <?

ONE OF THE FOLLOWING:

LF - LOOK AT LINK FLOWS AND COSTS OF MOST RECENT RUN
T  - LOOK AT THROUGHPUT-DELAY TABLE OF MOST RECENT RUN
G  - DRAW THROUGHPUT-DELAY CURVES
Q  - QUIT OUTPUT

ENTER OUTPUT COMMAND OR ? T
SPECIFY OUTPUT DEVICE OR ? <?
ONE OF THE FOLLOWING:

TTY - OUTPUT TO TELETYPE          F - OUTPUT TO DISK FILE
?   - PRODUCES THIS LIST          Q - QUIT THIS SPECIFIC OUTPUT
HC  - OUTPUT TO HARDCOPY DEVICE (IF SLAVE IS AVAILABLE)

SPECIFY OUTPUT DEVICE OR ? <F

ENTER 5 CHARACTER FILE NAME: ROUTO

FILE ROUTO.DAT OPENED ON DISK UNIT - 21
SPECIFY OUTPUT DEVICE OR ? <

SPECIFY OUTPUT DEVICE OR ?

ENTER OUTPUT COMMAND OR ?

ENTER OUTPUT COMMAND OR ?

ENTER ROUTING COMMAND OR ? <

ENTER ROUTING COMMAND OR ?

ENTER SYSTEM COMMAND OR ?

ARE YOU ALL DONE? Y OR N? <Y


CPU TIME: 45.10   ELAPSED TIME: 35:1?.??
NO EXECUTION ERRORS DETECTED

IMPACT OF INTERACTIVE GRAPHICS
ON NETWORK DESIGN


I. INTRODUCTION


Recently NAC has developed an interactive program that analyzes packet switching networks for cost, throughput, delay, and reliability performance. Input and output are handled via a graphic terminal, which displays network topology, allows easy topological reconfiguration and shows curves of network performance (delay, reliability, etc.) versus a variety of network parameters (throughput, failure rates, etc.).

The details of the algorithms used in the analysis are presented in [1], [2], and [3]. This chapter is a description of the interactive program and its use, the program's performance, a short comparison with an equivalent batch program and implications of interactive graphics on network design.

II. FUNCTIONAL SPECIFICATIONS OF THE INTERACTIVE
NETWORK ANALYSIS PROGRAM


NAC's interactive network analysis program, called RELROUT, is located on the (NAC) directories at USC-ISI and BBN for use by any network user. It runs on a PDP-10 using the TENEX operating system. The program can be run from any type of terminal, but will only support graphics on an IMLAC PDS-10 graphics display unit.

The program analyzes a packet switching network for routing and reliability performance.

The user must specify:

• A network topology.

• Properties of nodes and links.

• General parameters such as packet size, etc.

The topology can be entered in two ways:

1. Create a new network configuration through the use of network editing commands; or

2. Read an existing configuration from a previously generated data file.

The second method of inputting a network is very useful when repeated evaluations of the same basic topology are required. On an IMLAC terminal, the user may move the position of the nodes on the CRT in order to obtain a clearer and more intelligible graphic representation of the network.


The user defines the following node and link properties:

A. Node Properties

1. Name (optional).

2. Location (longitude and latitude, for cost calculation in the routing analysis).

3. Failure probability (for reliability analysis).

4. Symbol type (for display purposes).

B. Link Properties

1. Capacity or line speed (for routing and cost analysis).

2. Failure probability (for reliability analysis).

3. Link type - solid or dotted (for display purposes).

After the user has defined the network configuration and properties, he can request either a routing or a reliability analysis to be performed. Associated with each analysis are various general parameters.

For a routing analysis these are:

1. Average packet delay.

2. Average packet length.

3. Overhead.

4. Tariff structure, etc.

For the reliability analysis the parameters are:

1. Number of samples,

2. Random number seed,

3. Range of variation for the probabilities.

Initially all of these parameters are defaulted to specified values. However, the user can change any or all of them. The program does error and validity checking on the parameters that are changed. Also, upon user request, a short description of each parameter can be provided.
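To show how the three reliability parameters (number of samples, random number seed, range of failure probabilities) could enter a sampling-based analysis, here is a naive Monte Carlo sketch in Python. It is not the algorithm used by RELROUT, whose details are given in [1], [2], and [3]. It assumes that only links fail, independently and with a common probability, and the four-node ring is made-up test data. It estimates the probability that the network is disconnected and the expected fraction of node pairs that cannot communicate.

```python
import random

def component_sizes(nodes, up_links):
    """Sizes of the connected components, via a small union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in up_links:
        parent[find(a)] = find(b)
    sizes = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    return list(sizes.values())

def reliability_estimate(nodes, links, p_fail, n_samples, seed=0):
    """Return (P[net disconnected], E[fraction of node pairs cut off])."""
    rng = random.Random(seed)
    n_pairs = len(nodes) * (len(nodes) - 1) // 2
    n_disconnected = 0
    lost_fraction_sum = 0.0
    for _ in range(n_samples):
        # Each link survives independently with probability 1 - p_fail.
        up = [link for link in links if rng.random() >= p_fail]
        sizes = component_sizes(nodes, up)
        if len(sizes) > 1:
            n_disconnected += 1
        connected_pairs = sum(s * (s - 1) // 2 for s in sizes)
        lost_fraction_sum += 1.0 - connected_pairs / n_pairs
    return n_disconnected / n_samples, lost_fraction_sum / n_samples

# Hypothetical example: a 4-node ring, swept over a few failure probabilities.
nodes = [0, 1, 2, 3]
links = [(0, 1), (1, 2), (2, 3), (3, 0)]
for step in range(0, 20, 5):
    p = step * 0.05
    p_nc, f_nc = reliability_estimate(nodes, links, p, n_samples=100, seed=0)
    print(f"{p:.2f}  {p_nc:.3f}  {f_nc:.3f}")
```

Fixing the seed makes a run repeatable, which is presumably why the seed is exposed as a user parameter. Fewer samples shorten the run at the price of a larger variance in the estimates.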

When  the  user  is  satisfied  with  the  values  of  the  various 

parameters,  he  can  ask  for  the  analysis  to  begin.  In  most  cases, 

t e processing  is  done  locally  in  the  PDP-11),  however,  the  reliability 

ana  ysis  can  be  performed  remotely,  as  discussed  later  in  this 
chapter. 

When  the  analysis  is  completed,  the  user  can  examine  various 

outputs.  The  routing  analysis  output  consists  of  a list  of  link 

flows,  lengths,  and  costs,  and  values  for  global  throughput,  delay 
and  cost. 

There  is  also  a table  of  network  throughput  and  delav  as  a 
function  of  relative  traffic.  Reliability  output  consists  of 
tables  of  Pnc  (probability  of  network  disconnected)  and  F (fraction 
of  node  pairs  not  able  to  communicate)  as  a function  of  component 
failure  rates.  If  the  user's  terminal  can  support  graphics  (IMLAC) , 
the  program  will  plot  curves  for  throughput  vs  delay  and  for 

Pnc  and  Fnc  of  the  most  recent  run  along  with  at  most  two  previous 
runs,  r,i ’ ~ - 


(See  Figures  1,2,  and  3) . 
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After  examining  the  output  of  the  specific  analysis,  the 
user  typically  will  do  one  of  the  following: 


1.  Perform a sensitivity analysis on the present network
configuration by varying the input parameters.

2.  Switch to the other analysis (routing to reliability
or vice versa).

3.  Edit the topology (add nodes and/or links) and/or
change node or link property values and proceed with
another routing or reliability analysis.

4.  Exit the program.

At any time during the interactive session, the user may save
his network topology on a data file for use at a later time. After
leaving the program, the user is returned to the TENEX operating
system.

NAC's  interactive  program  assists  the  network  analyst  by: 

1.  Offering  a schematic  representation  of  the  current 
network  configuration  on  a graphic  display  device; 

2.  Allowing  interactive  error  checking  of  topology 
and  input  parameters  so  that  the  user  can  correct  the 
input  data  immediately; 

3.  Removing the necessity of explicitly entering all
the data for each run and allowing flexibility in the
order and format of input;

4.  Using a flexible command structure, with numerous
prompts to tutor the novice user if he requests help,
and with short, quick commands for the experienced user;

5.  Providing network performance results with small
enough response time, so that an effective man-machine
interactive design can be carried out.

[Figure: delay (sec) versus throughput (kbits/sec) for ARPANET
configurations: all 50 kb/s lines versus one cross-country chain
changed to 230 kb/s.]

III.  TURNAROUND DELAY PERFORMANCE OF INTERACTIVE PROGRAMS


A.  General 


Turnaround delay is probably the most important performance
criterion for evaluating the effectiveness of an interactive
analysis and design program. In fact, the purpose of an interactive
program implementation is to provide the systems analyst with faster
answers and, therefore, better man-machine interaction than he could
obtain with the batch version of the same program. In this section,
the turnaround time performance of our routing and reliability
interactive programs is evaluated under a variety of load conditions.

Turnaround time is here defined as the delay between the
time the RUN command (which starts execution) is entered and the time
the output comes back on the screen. In a time sharing environment,
such a delay is approximately proportional to the average number of
tasks (system and user) simultaneously requesting the CPU; in a TENEX
system, this number is referred to as the Load Average.

B.  Routing Program Turnaround Delay

NAC's  routing  program  is  based  on  an  iterative  algorithm 
which  attempts  to  raise  network  throughput  while  maintaining  the 
specified  delay  constraint.  Therefore,  the  CPU  time  is  proportional 
to  the  number  of  iterations  performed.  In  the  following  experiments, 
five  iterations  were  allowed  for  each  run,  since  at  that  point,  a 
sufficient  accuracy  was  generally  obtained. 

Several routing runs were performed on three network
examples with 10, 26, and 42 nodes, respectively, using NAC's
interactive program at USC-ISI. The same network was analyzed at
different times during the day, and with different load averages.
The curves in Figures 4, 5, and 6 show the elapsed time versus load
average for each network application.

For networks on the order of 10 nodes, the elapsed time is
quite tolerable for any reasonable value of load average (note: the
typical range of load averages is between 2 and 8). Furthermore,
intermediate throughput and delay results are printed at the terminal
after each routing iteration, to hold the user's attention while
computation is proceeding.

For larger networks (20 nodes or more), load average has a
more critical impact on elapsed time. In the case of high load
average, the user can avoid long delays by reducing the number of
iterations and thus sacrificing accuracy; alternatively, the user can
transfer his work to another host with a lower load average, an
option made available by a resource-sharing network.

C.  Performance of Interactive Reliability Analysis

In general, the reliability of a network with a size such
as ARPANET's will not change drastically with the insertion or
deletion of a few links. Furthermore, the reliability criteria, Pnc
and Fnc, will not change at all if only line speeds are varied. The
routing performance, on the other hand, can be substantially affected
by these variations. A design session typically consists of several
topological modifications and routing performance evaluations, with
a reliability analysis after major topological changes. Therefore,
an interactive network design package should include a reliability
analysis.

The reliability program currently included in NAC's
interactive analysis and design package is essentially a simulation
program [2]. The accuracy of the results is, therefore, related to
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the  number  of  samples  generated  in  the  simulation.  Table  1 shows 
the  results  for  several  reliability  analyses  using  1000  samples  for 
networks  of  10,  26,  and  42  nodes.  The  runs  were  made  at  ISI  at 
approximately 8:30 EDT (5:30 PDT).


TABLE 1

RELIABILITY PROGRAM PERFORMANCE

            Load Average    Elapsed Time      CPU Time

10 Nodes        1.6         1 min. 49 sec.    59 sec.
26 Nodes        1.5         3 min. 58 sec.    2 min. 37 sec.
42 Nodes        2.5         9 min. 35 sec.    4 min. 18 sec.

As can be seen, the elapsed time for the reliability analysis, even
at very low load averages, tends to reach levels which are intolerable
for interactive analysis.
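The figures in Table 1 can be checked against the earlier statement that turnaround delay is roughly proportional to load average. The short sketch below converts the tabulated times to seconds and prints the ratio of elapsed time to CPU time for each network; the variable and function names are illustrative only.

```python
# (nodes, load average, elapsed seconds, CPU seconds), taken from Table 1
TABLE_1 = [
    (10, 1.6, 1 * 60 + 49, 59),
    (26, 1.5, 3 * 60 + 58, 2 * 60 + 37),
    (42, 2.5, 9 * 60 + 35, 4 * 60 + 18),
]


def stretch_factors(rows):
    """Map network size to the ratio of elapsed time to CPU time."""
    return {nodes: elapsed / cpu for nodes, _load, elapsed, cpu in rows}


if __name__ == "__main__":
    factors = stretch_factors(TABLE_1)
    for nodes, load, _elapsed, _cpu in TABLE_1:
        print(f"{nodes} nodes: load average {load}, "
              f"elapsed/CPU = {factors[nodes]:.2f}")
```

The ratios come out near 1.85, 1.52, and 2.23, which is of the same order as the recorded load averages of 1.6, 1.5, and 2.5. Three runs are far too few to confirm the proportionality, but they do not contradict it.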

Because of the extensive amount of computation required for
a reliability analysis, it would be desirable to have the analysis
done in batch on a big "number crunching" machine, and yet allow
the user the flexibility of interactive editing and validation of data
and graphic display of output. This feature can actually be implemented
on ARPANET. In fact, with the remote job service (RJS) in association
with the IBM 360/91 at UCLA, the user does have the option of local
or remote processing of the reliability analysis. He can direct the
interactive program to perform the analysis locally by entering the
command R(UN). Or, he can request the interactive program to create
an RJS data file complete with the job control language (JCL) needed
for execution. The user can then leave the interactive program,
enter the RJS subsystem, and submit the reliability job file. He
can wait for the output, or he can return to the interactive
program and, perhaps, continue with a routing analysis, periodically
checking to see if the IBM 360/91 has completed the execution of
the reliability analysis.

Once the output is ready, the data file is read into the
interactive program and various reliability curves can be displayed.
Efforts are currently being made to make this RJS feature invisible
to the user. However, even now this feature of parallel processing,
made available by ARPANET resource-sharing capability, is very useful
for any interactive computation.

IV.  INTERACTIVE VERSUS BATCH NETWORK DESIGN

It  is  possible  that  a one-time  network  evaluation  could  be 
performed  faster  and  more  economically  in  a batch  mode  than  in  an 
interactive  mode.  However,  what  we  are  concerned  with  here  is  a 
complete  design  session  in  which  a network  is  analyzed  and  modified 
repeatedly  using  man-computer  interaction.  In  this  section,  we 
compare  the  overall  length  of  time  involved  in  a design  session 
using either a batch program or an interactive program. A design
session consists of: analysis of a given network configuration;
use of results to modify the topology in order to improve network
performance (reduce cost, improve throughput and reliability);
re-evaluation of the new network configuration, etc.

Table  2 shows  the  steps  a designer  would  take  in  a "design 
session",  evaluating  routing  performance,  using  an  interactive  or 
a batch  approach. 

In Step I1, the user interactively sets up the network
configuration and the data base internal to the program. In the batch
case, the designer must first draw the configuration and then
keypunch the appropriate cards. Setting up the routing
parameters (I2 and B2) requires basically the same amount of time in
both interactive and batch mode; however, the interactive program
displays on the screen a description of each input parameter, and
performs validity and error checking on the input data, thereby
avoiding error and confusion.

In Step I3, the user enters the R(UN) command and execution
of the routing analysis begins. In the batch case (B3.1 to B3.5),
the user initiates the job and receives the output. Even though
the actual batch computation is generally faster than the interactive
computation (5 sec. for a 42 node network on a CDC 6600 as opposed
to 24 sec. on a PDP-10 TENEX), there is a substantial fixed delay
before and after job execution which is independent of CPU time
requirements (reading in the deck, job waiting on input and output
queues, waiting for the printer). This delay can be on the order of
15 - 20 minutes, sometimes even longer.

In Step I4 or B4, the designer examines the results of the
analysis. The time required to examine the data on the screen is
comparable to the time required to read the same data on the
printout; however, in the interactive mode, the designer can instantly
compare the present results with those obtained from previous runs
(most notably plots of throughput versus delay). After the
topological modifications are made on the configuration displayed on
the CRT (Step I5), the designer is ready to perform another analysis
and returns to Step I3. In the batch case, on the other hand, the
designer must redraw his new configuration by hand and then keypunch
the appropriate cards. He then must return to the card reader to
initiate a new analysis.

Thus, for the first iteration of the analysis, the effort required
of the designer is comparable for both the interactive and batch
approaches. However, for the subsequent iterations, the effort is
considerably less with the interactive procedure.

INTERACTIVE                          BATCH

I1    Create Topology on             B1.1  Draw Map Manually
      Graphics Terminal              B1.2  Keypunch Topology

I2    Set Up Routing                 B2    Keypunch Routing
      Parameters                           Parameter Cards

I3    Enter R(un) Command,           B3.1  Read in Deck
      Job Executes                   B3.2  Job Waits on Input Queue
                                     B3.3  Job Executes
                                     B3.4  Job Waits on Output Queue
                                     B3.5  Printer Prints Out Results

I4    Designer Looks at              B4    Designer Looks at Results
      Results on Terminal

I5    Makes Modifications of         B5.1  Make Modifications on Map
      Topology on Display Unit       B5.2  Changes Topology

TABLE 2
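The comparison of steps I3 and B3 can be put in rough numbers. The sketch below adds up only the time spent waiting for results, using the figures quoted in the text (24 sec. interactive, 5 sec. batch computation, and a fixed batch delay of 15 to 20 minutes). The session length of ten iterations is a hypothetical value chosen for illustration, and the time spent editing the topology or reading results is deliberately left out.

```python
def waiting_seconds(iterations, compute_seconds, fixed_delay_seconds=0.0):
    """Total time spent waiting for results over one design session."""
    return iterations * (compute_seconds + fixed_delay_seconds)


ITERATIONS = 10  # hypothetical session length, not taken from the report

interactive = waiting_seconds(ITERATIONS, compute_seconds=24)
batch_low = waiting_seconds(ITERATIONS, compute_seconds=5,
                            fixed_delay_seconds=15 * 60)
batch_high = waiting_seconds(ITERATIONS, compute_seconds=5,
                             fixed_delay_seconds=20 * 60)

if __name__ == "__main__":
    print(f"interactive waiting: {interactive / 60:.1f} min")
    print(f"batch waiting: {batch_low / 60:.1f} to {batch_high / 60:.1f} min")
```

Under these assumptions the fixed batch delay dominates the total, and the faster batch computation does not offset it. The gain over a whole session is smaller than this waiting-time ratio suggests, since the time the designer spends examining results is comparable in both modes.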
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V. 


IMPACT OF INTERACTIVE GRAPHICS ON NETWORK DESIGN


The most conspicuous effect of interactive graphics on network
design is that of speeding up the overall design process very
substantially. The amount of time saved and the increase in
productivity clearly depend on many factors (efficiency of the
interactive graphic system, programmer's experience, load on the
computer, etc.). Our experience indicates that the design process is
typically speeded up by 5 to 10 times.

Another important effect of interactive graphics on network
design is the interactive use of analysis programs to develop better
design algorithms. In fact, most network design algorithms are based
on heuristics, and good heuristics are obtained by combining physical
intuition, careful observation of several network properties, and
evaluation of several examples. Therefore, an interactive program is
an extremely valuable tool in the hands of a network designer who is
trying to establish experimentally some general relationship between
network configuration, topological transformations, and network
performance.

Besides assisting the network designer in the development of
better heuristics, interactive graphics can also provide a way to
monitor design algorithms which are completely automated, and in
principle would not require human intervention. Since most network
design algorithms are iterative and typically perform a topological
transformation at each step, it is possible to implement them in
an interactive mode so that they display the current solution at
each iteration. The designer, therefore, can evaluate the cost
effectiveness of each transformation, and can stop, correct, and
restart the algorithm whenever he identifies some inadequacy in the
current solution.
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The  monitoring  and  verification  function  of  interactive  graphics 
is  very  valuable  for  any  heuristic  design.  In  fact,  it  is  extremely 
rare  to  find  heuristics  that  perform  well  on  all  possible  problems. 

In  general,  there  are  always  cases  in  which  the  heuristic  solution 
is very bad. Sometimes the solution is so bad that the designer can
detect  and  correct  the  inconsistencies  by  simple  visual  inspection. 
Thus,  the  importance  of  visually  monitoring  the  solutions. 
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VI.  EXAMPLE 


A practical  application  of  the  interactive  program  arose  when 
NAC  was  asked  to  evaluate  the  possibility  of  significantly  reducing 
the  communication  costs  of  ARPANET  and  the  impact  of  potential  cost 
reductions on the network's performance. Below is the report
summarizing the results.

Requirements  used  in  the  study  are  as  follows: 

1.  Average  packet  delays  under  0.2  seconds 
throughout  the  net. 

2.  Capacity  for  expansion  to  64  IMPs  without 
major  hardware  or  software  redesign. 

3.  Average  total  throughput  capability  of 
200  - 300  kilobits/second  for  all  Hosts. 

4.  Peak  throughput  capability  of  40  - 80 
kilobits/second  per  pair  of  IMPs  in  an  otherwise 
unloaded  network. 

5.  High  communication  subnet  reliability 
subject  to  economic  constraints. 

The time delay and throughput requirements imply that 50
kilobit/second communication lines are needed within the network.
Factors impacting network designs have been:

1.  Nine  month  lead  times  for  obtaining  lines 
from  AT&T. 
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2.  Rapid  expansion  of  the  number  of  IMPs  and 
TIPs in the net (averaging one new node per
month).

3.  Rapid  increase  in  traffic  in  the  network. 

The  traffic  growth  in  the  net  prompted  us  to  study,  within  the  present 
contract  year,  the  problem  of  increasing  the  network's  traffic 
capacity  and  the  associated  costs  of  such  increases.  Now  that  the 
network traffic has become relatively stable because of the
saturation of serving Hosts and the reduction of the rate of addition of
new  nodes  to  the  network,  the  object  of  the  present  study  was  to 
reevaluate  network  costs  as  a function  of  the  various  parameters  in 
the  network. 

Several  options  are  available  to  reduce  cost  in  the  network. 

These  are: 

1.  Rearrangement  of  lines. 

2.  Reduction of line capacities in the network
either on a limited basis or throughout the
network.

3.  Introduction  of  new  technology  to  reduce 
overall  line  costs. 

A constraint imposed on all alternatives is that communication subnet
reliability should not significantly decrease from that of the present
network. This constraint dictates that the network remain
two-connected and imposes certain other technical conditions upon
network topology.
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Major  modifications  to  the  present  network  cause  reduction  to 
network  performance  with  respect  to  one  or  more  of  the  network's 
performance  parameters.  A simple  rearrangement  of  lines  to  reduce 
cost  also  reduces  the  network's  expansion  capability  for  at  least 
nine months (since this is the time required to obtain new 50
kilobit/second circuits from AT&T). Throughput is also somewhat reduced.

Decrease of line capacities substantially reduces total network
throughput, increases time delay, and, most significantly, theoretically
increases file transfer time by at least a factor of 2.6 (based on
available AT&T circuit options). The cases where some or all of the
lines are reduced in speed were considered.

If lower speed lines (in particular, 19.2 kilobit/second lines)
are used, the use of a new device called a biplexer to further
reduce the cost of some lines is possible. Although many have not yet
been installed and thoroughly tested, the technical concept is sound
and, should a decision to reduce line speeds to 19.2 kilobits/second
be made, several biplexers should be obtained and tested, and, if
successful, used wherever appropriate. The analyses consider networks
both with and without biplexers.

Finally, the case where line speeds are decreased to 9.6
kilobits/second was considered. This results in a network which can
only marginally handle existing network traffic, has no expansion
capability, very high average time delays (over 1 second), and file
transfer times more than five times as great as those achievable at
present. While the communication costs of such a network would only be
1/3 of the present network's cost, its performance would be so poor
that this option was not extensively examined.

Figure 7 summarizes the available alternatives and communication
costs. Options that we prefer, given that a decision is made to
degrade ARPANET performance, are also indicated, as are the
disadvantages of each alternative. An important point to be understood
is that communication costs do not reflect all of the issues that must
be considered in changing the present ARPANET approach. For example, if
all of the network's lines were reduced from 50 kilobits/second to 19.2
kilobits/second (and the network were reconfigured), communication
line costs would decrease from about $1.1 million per year to either
$736,000 (without biplexers) or $694,000 (with biplexers). However,
file transfer time would increase by a factor of 2.6 and the computer
connect time costs, to transfer these files, could increase by as
much as $72,000 per year. (These calculations are based on available
network measurement data.)
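The arithmetic behind this example can be written out as a short sketch. It uses only the figures quoted above; treating the $72,000 increase in connect time costs as a worst case that is subtracted in full is an assumption made here for illustration, and the names are hypothetical.

```python
PRESENT_LINE_COST = 1_100_000    # "about $1.1 million per year"
MAX_EXTRA_CONNECT_COST = 72_000  # upper bound quoted for added connect time

LINE_COST_AT_19_2 = {
    "without biplexers": 736_000,
    "with biplexers": 694_000,
}


def net_saving(new_line_cost):
    """Yearly saving after charging the worst-case connect time increase."""
    return PRESENT_LINE_COST - new_line_cost - MAX_EXTRA_CONNECT_COST


def slowdown(old_kbps, new_kbps):
    """Ideal increase in file transfer time when line speed is reduced."""
    return old_kbps / new_kbps


if __name__ == "__main__":
    for option, cost in LINE_COST_AT_19_2.items():
        print(f"19.2 kb/s {option}: net saving ${net_saving(cost):,} per year")
    print(f"50 -> 19.2 kb/s: transfer time x {slowdown(50, 19.2):.1f}")
    print(f"50 -> 9.6 kb/s:  transfer time x {slowdown(50, 9.6):.1f}")
```

The line speed ratios reproduce the factor of 2.6 quoted for 19.2 kilobit/second lines and the "more than five times" quoted for 9.6 kilobit/second lines.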

Using  the  interactive  program,  this  study  took  approximately 
four  hours.  Using  programs  that  run  in  batch,  this  study  could  have 
taken  several  days. 
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FIGURE  7 
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VII.  FUTURE  RESEARCH 


Future  research  plans  in  the  area  of  interactive  network  design 
tools  include  the  following  items: 

1.  Implementation of the Cut-Saturation Network
Design Algorithm [4] as an interactive program.

2.  Development  of  more  efficient  and  less  time 
consuming  reliability  analysis  algorithms,  which 
would  allow  more  frequent  reliability  evaluations 
during  the  design  process. 


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COMPUTATIONAL  COMPLEXITY  OF  NETWORK 
CONNECTIVITY  ALGORITHMS 


I.  INTRODUCTION 

A basic property of a graph is its connectivity structure;
that is, how many connected pieces it divides into. The
determination of this simple property is fundamental to many more
complex calculations. It is equivalent to determining equivalence
classes.

One important application is in network reliability calculations.
Suppose we are given n nodes for a network. There are n(n-1)/2
possible distinct undirected links (not counting loops) among them.
Suppose we add these links sequentially and in random order. What
is the expected number of links we must add to connect all n nodes?

This value allows one to estimate a bound on the average running
time for a simulation technique for reliability analysis [23]. In
this chapter we describe closed form solutions to this and related
problems for finite n. The method used is to consider the processes
as Markov processes on the lattice of partitions.
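The question posed above can be explored numerically before any closed form is derived. The following is a minimal sketch that adds the n(n-1)/2 links in random order and counts how many are needed before the graph becomes connected. The union-find bookkeeping and all function names are choices made for this illustration; they are not the method of this chapter. For very small n the mean is computed exactly by enumerating every ordering, and for larger n it is estimated by sampling.

```python
import itertools
import random


def edges_until_connected(n, edge_order):
    """Add edges in the given order and return how many had been added
    when the graph on nodes 0..n-1 first became connected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    if components == 1:
        return 0
    for count, (u, v) in enumerate(edge_order, start=1):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
            if components == 1:
                return count
    raise ValueError("edge order does not connect the graph")


def all_links(n):
    """All n(n-1)/2 undirected links on nodes 0..n-1."""
    return list(itertools.combinations(range(n), 2))


def exact_mean(n):
    """Exact mean over every ordering of the links (tiny n only)."""
    total = 0
    orderings = 0
    for order in itertools.permutations(all_links(n)):
        total += edges_until_connected(n, order)
        orderings += 1
    return total / orderings


def sampled_mean(n, trials=2000, seed=0):
    """Monte Carlo estimate of the mean for larger n."""
    rng = random.Random(seed)
    links = all_links(n)
    total = 0
    for _ in range(trials):
        rng.shuffle(links)
        total += edges_until_connected(n, links)
    return total / trials


if __name__ == "__main__":
    print("n = 4, exact mean:", exact_mean(4))
    print("n = 26, sampled mean:", sampled_mean(26))
```

For n = 4 the exact mean can be checked by hand. Of the 20 possible sets of three links, 16 are spanning trees and 4 are triangles that leave one node isolated, and in the latter case the fourth link always connects the graph, so the mean is 3(0.8) + 4(0.2) = 3.2.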

The  problem  we  discuss  arises  in  the  evaluation  of  algorithms 
for  determining  spanning  trees  and/or  the  connected  component 
structure  of  graphs  and  networks.  Some  of  these  algorithms  are 
given  in  [9],  [13],  [14],  and  [22].  Mathematically,  the  problem 
has  its  own  history  and  seems  to  have  appeared  (in  the  form 
studied  here)  for  the  first  time  in  [6],  although  forms  of  it 
were studied earlier in [10] and [11].

The problem can be succinctly described as follows: At each
point in "time" (sequentially) we select a subset of size two (edge)
from a set of n labelled objects (nodes, vertices) "at random". At
random means that each edge not previously selected is equally likely
to be picked. A graph generated "at random" is called a "random graph".
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What  is  the  probability  that  the  "random  graph"  is  connected  after 
j edges have been selected? What is the "mean time" to connectedness?
More generally, given a connected component structure $\pi$, what is the
mean time to obtaining a graph with component structure $\pi$? We
obtain "closed form" solutions and relatively simple computing
equations for the problems just posed as well as other combinatorial
functions related to the process of generating random graphs. A
number of these results have been obtained by other methods by the
authors mentioned and others. Our methods are based on the work
in [20], [21].

The  problems  described  above  have  a long  and  interesting  history 
going back to 1956 and to questions about trees in the late
nineteenth century. The problem of determining the probability that
a random  graph  has  a given  connected  component  structure  was 
apparently  first  posed  in  [6].  The  question  of  determining  the 
number  of  connected  graphs  on  n-nodes  with  j edges  was  posed  and 
solved  in  terms  of  generating  functions  in  [19],  [11],  [1]  and 
[2]  and  perhaps  elsewhere  between  1959  and  1971. 

In  analysing  the  computational  complexity  of  algorithms  for 
determining  the  connected  component  structure  of  a graph,  it  is 
useful  to  know  the  mean-time  to  obtaining  that  structure  as  well 
as  the  sojourn  time  in  various  structures  sequentially  obtained. 

To  our  knowledge,  these  problems  have  not  been  discussed  in  the 
literature.  In  the  special  case  of  a connected  graph  with  (n-1) 
edges  we  ask  for  the  number  of  trees  on  n-vertices.  This  question 
has  its  own  history  starting  with  [4],  [5],  [3],  see  especially  [15]. 

Some  results  of  these  authors  are  obtained  herein  as  special  cases 
of  our  results. 

In  [21]  the  suggestion  was  offered  that  a number  of  probability 
questions  are  more  "naturally"  posed  and  solved  by  mapping  sample 
spaces  into  semilattices  (semilattice  variables)  rather  than  to  the 
real line (random variables). In some examples, the distribution
function of the mapping (called a generating function) can be
determined and inverted using the Möbius-Rota inversion theorem
(see e.g. [20]) to obtain the density function (distribution
function in [21]) of the mapping. Indeed this is a useful
method for obtaining the probability that a "random graph" with
j edges has a given connected component structure. The approach
yields a complicated but closed form formula from which the
generating function is easily obtained. Furthermore, we show
that the process of adding edges at random is a Markov process
and hence, that theory can be used to study the generation of
random graphs. It is easy to obtain the probability transition
function for the Markov process, which turns out to be non-stationary.
None of the formulae obtained are useful for computing, so we close
the paper with a pair of coupled equations which can be used for
computing the combinatorial functions of interest. This is carried
out in Sections III-IV. Notations and basic definitions are
introduced in Section II.

The question of how "connected" a graph is has been subjected to
numerous definitions. The commonly accepted definition is the size
of a minimal cut-set. There are many reasons why this is not a
particularly good definition. Two obvious reasons are:

1)  It cannot be applied to disconnected graphs, although
obviously some disconnected graphs are more connected than
others.

2)  Many connected graphs have the same size minimal cut-set
although the complexity of their internal structure is quite
different.

A definition of "connectivity" which would differentiate
between connected graphs by measuring their internal structure and
also differentiate between disconnected graphs would seem to have
many obvious merits and applications. We propose and discuss such
a measure in Sections VI-XI.

II.  THE LATTICE OF PARTITIONS

A lattice is a partially ordered set where each pair of
elements has a greatest lower bound and least upper bound (g.l.b. and
l.u.b.). We denote the statement that a is less than or equal to b
in the partial ordering by $a \le b$. In a lattice, as on the real line,
we have three types of intervals, or segments: $[x,y]$, $[x,y)$, $(x,y)$,
which are respectively the sets of a's for which $x \le a \le y$;
$x \le a < y$; $x < a < y$. It is easy to see that $[x,y]$ is a
sublattice of the original lattice when endowed with the same ordering.
An element b is said to cover an element a when $a < x < b$ is
satisfied by no x, or equivalently when the interval $[a,b]$ contains
exactly two elements. A Hasse diagram is a pictorial depiction of a
lattice where the elements are represented as points and a line is
drawn from b to a when b covers a.

An element of a lattice is minimal (maximal) if it covers no element
(is covered by no elements). A unique minimal (maximal) element is
called a zero or least (unit or greatest) element. When the zero or
unit exist they are denoted by 0 and I respectively. An atom is an
element which covers a minimal element while a dual atom is an element
covered by a maximal element.




Let $V_n$ be a finite set with n labelled objects. A partition $\pi$
of $V_n$ is a family of disjoint subsets of $V_n$, say $P_1, \ldots, P_k$,
called parts or blocks of $\pi$, whose union is $V_n$. A natural ordering
on the set $\Pi_n$ of partitions of $V_n$ is given by $\pi \le \sigma$ if
every block or part of $\pi$ is a subset of some part or block of
$\sigma$. It is not difficult to verify that the set $\Pi_n$ of partitions
of $V_n$ is partially ordered with the ordering just defined. In fact,
$\Pi_n$ is a lattice with that ordering. Furthermore, the lattice $\Pi_n$
has a zero 0, or least element, given by the partition with n singleton
parts. The lattice also has a unit I, or greatest element, given by the
partition containing one block, $V_n$ itself. The following result about
covers in $\Pi_n$ is obvious.

Lemma 2A.  An element $\pi \in \Pi_n$ covers $\sigma$ if and only if $\pi$
is obtained by combining any two parts or blocks of $\sigma$.

The lattice $\Pi_n$ is endowed with a very useful rank or numerical
ordering, as will be evident from an examination of the Hasse diagram
for small n, or from Lemma 2A. A partially ordered set is said to
satisfy the Jordan-Dedekind chain condition if:

1.  It has a zero and unit.

2.  All totally ordered subsets having a maximal
number of elements have the same number of elements.

A totally ordered subset of a partially ordered set is called a
chain, e.g., $a > b > \cdots > w$. A chain is maximal if it cannot be
enlarged. When a lattice satisfies the Jordan-Dedekind chain condition
one can introduce a rank function $r(p)$ on the lattice. The function
$r(p)$ is defined as the length of a maximal chain in the segment
$[0,p]$ minus one. The rank of zero is zero, the rank of an atom is
one, etc. If the rank of I is n-1 then the rank of a dual atom is
n-2. It is easy to verify that $\Pi_n$ satisfies the Jordan-Dedekind
chain condition and indeed the rank of a partition is n minus
the number of blocks or parts of it.
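The statements about covers and rank can be checked by brute force for small n. The sketch below enumerates all partitions of a small set, orders them by refinement, and compares the definition of "covers" (no element strictly in between) with the block-merging description of Lemma 2A. The representation of a partition as a frozenset of frozenset blocks, and all function names, are choices made for this illustration.

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partition of `items` as a frozenset of frozenset blocks."""
    items = list(items)
    if not items:
        yield frozenset()
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Either `first` forms a block of its own ...
        yield partial | {frozenset([first])}
        # ... or it joins one of the existing blocks.
        for block in partial:
            yield (partial - {block}) | {block | {first}}


def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(b <= c for c in coarse) for b in fine)


def rank(partition, n):
    """Rank in the partition lattice: n minus the number of blocks."""
    return n - len(partition)


def covers_by_definition(upper, lower, lattice):
    """True if lower < upper and no element lies strictly between them."""
    if upper == lower or not refines(lower, upper):
        return False
    return not any(
        x != lower and x != upper and refines(lower, x) and refines(x, upper)
        for x in lattice
    )


def merges_two_blocks(upper, lower):
    """True if `upper` arises from `lower` by combining two of its blocks."""
    return any(
        (lower - {a, b}) | {a | b} == upper
        for a, b in combinations(lower, 2)
    )


if __name__ == "__main__":
    n = 4
    lattice = list(set_partitions(range(n)))
    agree = all(
        covers_by_definition(p, q, lattice) == merges_two_blocks(p, q)
        for p in lattice for q in lattice
    )
    print(len(lattice), "partitions; Lemma 2A holds:", agree)
```

The exhaustive comparison is only practical for small n, since the number of partitions grows very quickly, but it is a convenient sanity check on the definitions.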
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Another  fundamental  descriptive  combinatorial  notation  for 
partitions is the type or class of $\pi$. A partition $\pi$ is of type or
class $(k_1, k_2, \ldots, k_n)$ when $k_i$ is the number of blocks or
parts with i elements. Obviously, if $\pi \in \Pi_n$ then
$\sum_i k_i = n - r(\pi)$, where $r(\cdot)$ is the rank function, and
$\sum_i i k_i = n$. The following structure lemma is fundamental for
forming recurrences on $\Pi_n$.

Lemma 2B: Let $\Pi_n$ be the lattice of partitions of a set with n elements. If $\pi \in \Pi_n$ is of rank k then the segment or interval $[\pi, I]$ is isomorphic to $\Pi_{n-k}$. If $\pi$ is of class or type $(k_1, k_2, \ldots, k_n)$ then the interval $[0, \pi]$ is isomorphic to the direct product of $k_1$ lattices isomorphic to $\Pi_1$, $k_2$ lattices isomorphic to $\Pi_2$, ..., $k_n$ lattices isomorphic to $\Pi_n$.

Among other applications of Lemma 2B, it was shown in [20] that it can be used to compute the Möbius function of $\Pi_n$. The Möbius function was shown by Rota to be an important invariant of lattices and hence a distinguished member of the incidence algebra over a partially ordered set.

If P is a partially ordered set, the incidence algebra of P, denoted by I(P), is the algebra of real valued functions $f: P \times P \to R$ such that $f(x,y) = 0$ whenever $x \not\le y$; addition and scalar multiplication are defined as usual and the product (convolution) $e = f * g$, with $e, f, g \in I(P)$, is given by

$$e(x,y) = \sum_{x \le z \le y} f(x,z)\, g(z,y).$$

It is easy to verify that I(P) is an algebra. The convolution becomes essentially matrix multiplication when P is finite. Among some distinguished elements of the incidence algebra are the functions:

zeta function: $\zeta(x,y) = 1$ if $x \le y$ and 0 otherwise

delta function: $\delta(x,y) = 1$ if $x = y$ and 0 otherwise

incidence function: $n(x,y) = \zeta(x,y) - \delta(x,y)$.

Since the delta function is easily verified to be an identity in I(P), certain elements of I(P) will be invertible in I(P) with respect to $\delta$.
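As an illustration of inversion in the incidence algebra, the following Python sketch computes the Möbius function (the inverse of the zeta function under convolution) from the bottom element of the partition lattice on n points. The function names are hypothetical, the lattice is built by brute force, and the approach is only practical for small n.

```python
def set_partitions(elements):
    """Yield every partition of `elements` as a frozenset of frozenset blocks."""
    if not elements:
        yield frozenset()
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        # Put `first` in a block of its own ...
        yield smaller | {frozenset([first])}
        # ... or add it to one of the existing blocks.
        for block in smaller:
            yield (smaller - {block}) | {block | {first}}


def refines(x, y):
    """x <= y in the partition lattice: every block of x lies inside a block of y."""
    return all(any(b <= c for c in y) for b in x)


def mobius_from_bottom(n):
    """Return {y: mu(0, y)} for the lattice of partitions of n points.

    Uses the defining recursion mu(0, 0) = 1 and
    mu(0, y) = -sum(mu(0, z) for 0 <= z < y).
    """
    # More blocks means lower in the lattice, so process those first.
    parts = sorted(set_partitions(list(range(n))), key=len, reverse=True)
    bottom = parts[0]  # the partition into n singletons
    mu = {}
    for y in parts:
        if y == bottom:
            mu[y] = 1
        else:
            mu[y] = -sum(mu[z] for z in mu if z != y and refines(z, y))
    return mu
```

For n = 4 the value at the top element I is -6, which agrees with the value $(-1)^{n-1}(n-1)!$ used later in the text for an interval isomorphic to $\Pi_n$.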

The next and last set of definitions can be found in [21].

Let S be an arbitrary finite set and $W(\cdot)$ be a weight function on S. (S can be thought of as a sample space and $W(\cdot)$ a probability measure.) Let $\Pi$ be a lattice. A mapping $X: S \to \Pi$ is called a semilattice variable, by analogy with probability theory and random variables.

Let $A(\Pi)$ be the algebra of single-valued functions $f: \Pi \to R$ with addition and scalar multiplication as usual, and $e = f * g$, with $f, g, e \in A(\Pi)$, given by

$$e(x) = \sum_{a \wedge b = x} f(a)\, g(b),$$

where $a \wedge b$ is the g.l.b. of the pair {a, b}. With this notation, if the weight function is a probability density on S then there is a function $f \in A(\Pi)$ which represents the density function of a semilattice variable X, given by

$$f(\pi) = \sum_{t :\, X(t) = \pi} W(t).$$

The generating function F of f is given by

$$F(\sigma) = \sum_{\pi \le \sigma} f(\pi) = \sum_{\pi \in [0, \sigma]} f(\pi),$$

and plays the role of a probability distribution function. The density function and distribution functions can be computed from each other by the Möbius-Rota inversion formula. This approach is useful in "random graph" theory. For other examples see [21].
Let the set $V_n = \{v_1, v_2, \ldots, v_n\}$ denote a set of n nodes or vertices of a graph. Let $\mathcal{E}_n = \{(e_1, e_2, \ldots, e_{\binom{n}{2}}) : e_i \text{ is a subset of size two from } V_n\}$ be the set of sequences, or all permutations, of the edges of the complete graph on $V_n$. We define the lattice stochastic process $X_1, X_2, \ldots, X_{\binom{n}{2}}$ by

$$X_j(e) = X_j(e_1, e_2, \ldots, e_{\binom{n}{2}}) = \text{the partition of } V_n \text{ determined by the connected components of the graph } (V_n, \{e_1, e_2, \ldots, e_j\}); \quad j = 1, 2, \ldots, \tbinom{n}{2}.$$

Thus, for example, $X_1(e)$ is always a partition of rank 1 and type (n-2, 1, 0, 0, ..., 0); $X_2(e)$ is always a partition of rank 2 with (n-2) parts but can be of two possible types, (n-3, 0, 1, 0, ..., 0) or (n-4, 2, 0, 0, ..., 0). In fact, for each $e \in \mathcal{E}_n$, $X_1(e) \le X_2(e) \le \cdots \le X_{\binom{n}{2}}(e) = I$, so that the process is monotonic nondecreasing in the natural ordering on $\Pi_n$. Furthermore, each pair $(X_j(\cdot), X_{j+1}(\cdot))$ has $X_{j+1}(e) = X_j(e)$ or $X_{j+1}(e)$ is a cover of $X_j(e)$, so that no wild jumps take place.

We assume that each $e \in \mathcal{E}_n$ has the same probability $1/\binom{n}{2}!$, so that we can define the density function

$$f_j(\pi) = P\{X_j = \pi\} = \frac{\text{no. of } e \in \mathcal{E}_n \text{ for which } X_j(e) = \pi}{\binom{n}{2}!}$$

and the one-step transition function $p_j(\sigma, \pi)$.

Thus if $X_j(e) = \pi$ then $X_{j+1}(e) = \pi$ if the (j+1)st edge of e is a subset of some part of $\pi$; otherwise, $X_{j+1}(e)$ is a cover of $\pi$ obtained by combining the two parts of $\pi$ which contain the points of $e_{j+1}$. The next theorem should be fairly obvious.
Theorem 3A: The semilattice process $X_1, X_2, \ldots$ is Markovian. The state I is absorbing and the process is absorbed with probability one.

Proof: The state I is absorbing since eventually the graph becomes connected. To see that the process is Markovian we observe that the probability of moving to any state depends only on the number of edges in the graph and the type of the partition. The number of ways of staying in the same state is computed by selecting edges not already selected from within the blocks of $\pi$; this is independent of the history. The number of ways of moving to any given cover depends only on the sizes of the parts to be combined, which is also independent of the history of the process.

It is easy to determine the transition functions, which are, incidentally, members of the incidence algebra $I(\Pi_n)$, and invertible.

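Theorem 3B: The transition function is given by

$$p_j(\sigma, \pi) = \begin{cases}
\dfrac{h(\sigma, \pi)}{\binom{n}{2} - (j-1)} & \text{when } \pi \text{ covers } \sigma, \\[2ex]
1 - \displaystyle\sum_{\nu \ne \sigma} \dfrac{h(\sigma, \nu)}{\binom{n}{2} - (j-1)} & \text{when } \sigma = \pi,\ f_{j-1}(\pi) \ne 0, \\[2ex]
1 & \text{when } \pi = \sigma,\ f_{j-1}(\pi) = 0, \\[1ex]
0 & \text{otherwise;}
\end{cases} \qquad j = 1, 2, \ldots, \tbinom{n}{2},$$

where $h(\sigma, \pi)$ is the number of edges connecting the pair of blocks of $\sigma$ which is one part of $\pi$ when $\pi$ covers $\sigma$, and zero otherwise.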
Proof: The formulae for $p_j(\cdot, \cdot)$ are obvious since at each stage each unselected edge is equally likely to be chosen.

Theoretically, Theorems 3A and 3B can be used to answer all questions about the process since we know the initial conditions and the transition functions. We will use these observations in the next section to compute various combinatorial functions. We can use the Möbius-Rota inversion formula to obtain a closed form solution for the probability of being in any given state at any given time, since the distribution function is easy to compute.

First, however, it is interesting to obtain a new formula in the form of the transition functions for the number of trees on n vertices.

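Theorem 3C: (Chapman-Kolmogorov Equations). The following formulae hold:

$$f_j(\pi) = \sum_{0 \le \sigma_1 \le \sigma_2 \le \cdots \le \sigma_{j-1} \le \pi} p_1(0, \sigma_1)\, p_2(\sigma_1, \sigma_2) \cdots p_j(\sigma_{j-1}, \pi)
= \sum_{0 \le \sigma_1 \le \cdots \le \sigma_{j-1} \le \pi}\ \prod_{i=1}^{j} p_i(\sigma_{i-1}, \sigma_i); \qquad j = 1, 2, \ldots, \tbinom{n}{2},$$

with $\sigma_0 = 0$ and $\sigma_j = \pi$.

Corollary: If $T_n$ is the number of trees on n vertices then

$$T_n = \binom{\binom{n}{2}}{n-1} \sum_{0 \le \sigma_1 \le \cdots \le \sigma_{n-2} \le I} p_1(0, \sigma_1)\, p_2(\sigma_1, \sigma_2) \cdots p_{n-1}(\sigma_{n-2}, I).$$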
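Let $F_j(\pi)$ be the distribution function of $X_j$; $j = 1, 2, \ldots, \binom{n}{2}$, i.e., $F_j(\pi) = \sum_{\sigma \le \pi} f_j(\sigma)$, or the probability that the partition of connected components is a refinement of the partition $\pi$ after j edges are introduced.

Theorem 3D: If $\pi$ is of type $(k_1, k_2, \ldots, k_n)$ then

$$F_j(\pi) = \frac{\dbinom{\sum_{i} k_i \binom{i}{2}}{j}}{\dbinom{\binom{n}{2}}{j}}.$$

Proof: The j edges are to be chosen from within the parts of $\pi$.

Since $F_j(\cdot)$ is known, $f_j(\cdot)$ can be computed by the Möbius-Rota inversion formula.

Corollary 1: The probability that the graph is connected after j edges have been added is

$$f_j(I) = \sum_{k=1}^{n} (-1)^{k-1} (k-1)! \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}} \frac{n!}{k_1!\, k_2! \cdots k_n!\, (1!)^{k_1} (2!)^{k_2} \cdots (n!)^{k_n}} \cdot \frac{\dbinom{\sum_{i} k_i \binom{i}{2}}{j}}{\dbinom{\binom{n}{2}}{j}}.$$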
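Proof: By Theorem 2A,

$$f_j(I) = \sum_{\sigma \in \Pi_n} F_j(\sigma)\, \mu(\sigma, I) = \sum_{k=1}^{n} \sum_{r(\sigma) = n-k} F_j(\sigma)\, \mu(\sigma, I)
= \sum_{k=1}^{n} \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}} B(n; k_1, \ldots, k_n)\, F_j(\sigma)\, \mu(\sigma, I),$$

where $\sigma$ is of type $(k_1, \ldots, k_n)$ and $B(n; k_1, k_2, \ldots, k_n)$ is the number of partitions of type $(k_1, \ldots, k_n)$ with k parts. It is easy to see that

$$B(n; k_1, \ldots, k_n) = \frac{n!}{k_1!\, k_2! \cdots k_n!\, (1!)^{k_1} (2!)^{k_2} \cdots (n!)^{k_n}}.$$

So that

$$f_j(I) = \sum_{k=1}^{n} \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}} \frac{n!}{k_1! \cdots k_n!\, (1!)^{k_1} \cdots (n!)^{k_n}}\, F_j(\sigma)\, \mu(\sigma, I).$$

Since $\mu(\sigma, I) = (-1)^{k-1} (k-1)!$ when $[\sigma, I]$ is isomorphic to $\Pi_k$, the result is proved.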
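The closed form of Corollary 1 can be checked numerically for small n. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the formula is evaluated as a sum over types $(k_1, \ldots, k_n)$ using exact rational arithmetic, and the comparison is against a brute-force count over all j-edge subsets of the complete graph.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial


def types(n):
    """Yield every type (k_1, ..., k_n) with sum(i * k_i) == n."""
    def rec(i, remaining):
        if i > n:
            if remaining == 0:
                yield ()
            return
        for k in range(remaining // i + 1):
            for tail in rec(i + 1, remaining - i * k):
                yield (k,) + tail
    yield from rec(1, n)


def prob_connected_formula(n, j):
    """Probability that j random distinct edges on n nodes give a connected graph."""
    result = Fraction(0)
    for ks in types(n):
        k = sum(ks)  # number of parts
        denom = 1
        for i, ki in enumerate(ks, start=1):
            denom *= factorial(ki) * factorial(i) ** ki
        count = factorial(n) // denom  # B(n; k_1, ..., k_n)
        inside = sum(ki * comb(i, 2) for i, ki in enumerate(ks, start=1))
        mobius = (-1) ** (k - 1) * factorial(k - 1)
        result += mobius * count * Fraction(comb(inside, j), comb(comb(n, 2), j))
    return result


def is_connected(n, edges):
    """Union-find connectivity test on nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    return components == 1


def prob_connected_brute(n, j):
    """Same probability by enumerating every j-edge subset."""
    subsets = list(combinations(list(combinations(range(n), 2)), j))
    return Fraction(sum(is_connected(n, s) for s in subsets), len(subsets))
```

For n = 4 and j = 3 both functions return 4/5, the probability of a tree.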
IV. DESCRIPTIVE COMBINATORIAL QUANTITIES

We will now derive formulae and generating functions for the interrelated combinatorial quantities of interest in the evaluation of algorithms. We will count sequences rather than deal with additional symbols for the associated probabilities. Obviously, each of the formulae is easily converted to probabilities, and moments can be computed. We will treat n, the number of vertices, as a parameter since it will be helpful in the next section, where we give a relatively simple way of computing mean time to a given component structure. The basic functions are:

A. $h_n(\pi; j)$ = the number of edges that can be added to a graph with j edges on n vertices with component structure $\pi$ so that the component structure stays at $\pi$.

B. $p_j(\sigma, \pi)$ = the number of ways of adding an edge to a graph G with (j-1) edges and component structure $\sigma$ to change the component structure to $\pi$.

C. $f_n(\pi, j)$ = the number of sequences of j edges on n vertices which determine the connected component structure $\pi$.

D. $C_n(\pi, j)$ = the number of sequences of j edges on n vertices which "enter" $\pi$ for the first time with the addition of the jth edge.

E. $S_n(\pi; j, k)$ = the number of sequences of k edges which enter $\pi$ for the first time on the jth edge and leave $\pi$ on the kth edge. This quantity is called the conditional sojourn time in $\pi$.

When $\pi$ covers a partition $\sigma$, the pair of numbers (i, j), where i and j are the sizes of the parts of $\sigma$ which are combined to form $\pi$, is called the cover type pair of $(\sigma, \pi)$.

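Lemma 4A: The quantity $p_j(\sigma, \pi)$ is given by

$$\text{F.}\quad p_j(\sigma, \pi) = \begin{cases}
0 & \text{if } \sigma \ne \pi \text{ and } f_n(\sigma, j-1) = 0, \\
1 & \text{if } \sigma = \pi \text{ and } f_n(\sigma, j-1) = 0, \\
h_n(\pi, j-1) & \text{if } \sigma = \pi \text{ and } f_n(\pi, j-1) > 0, \\
i\,k & \text{if } (i, k) \text{ is the cover type pair of } (\sigma, \pi), \\
0 & \text{otherwise.}
\end{cases}$$

Proof: The proof is immediate; actually, this is a repeat of Theorem 3B, restated for completeness.

Lemma 4B: The quantity $h_n(\pi, j)$ is given by

$$h_n(\pi, j) = \sum_{i=1}^{n} k_i \binom{i}{2} - j$$

when $\pi$ is of type $(k_1, k_2, \ldots, k_n)$.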
Proof: Obvious.

Lemma 4C: The quantity $C_n(\pi, j)$ is given by

$$\text{H.}\quad C_n(\pi, j) = \sum_{\sigma < \pi} f_n(\sigma, j-1)\, p_j(\sigma, \pi).$$

Proof: Immediate.

Lemma 4D: The quantity $S_n(\pi; j, k)$ is given by

$$\text{I.}\quad S_n(\pi; j, k) = C_n(\pi, j) \left[ \prod_{r=1}^{k-j-1} h_n(\pi, j+r-1) \right] \sum_{\sigma > \pi} p_k(\pi, \sigma).$$

Corollary: The number of sequences which spend k units of time in $\pi$, denoted by $s_n(\pi, k)$, is given by

$$\text{J.}\quad s_n(\pi, k) = \sum_{j=1}^{\binom{n}{2} - k} S_n(\pi; j, j+k); \qquad k = 0, 1, 2, \ldots$$

The proofs of all of the above lemmas follow from the Markovian property of the process $X_1, X_2, \ldots, X_{\binom{n}{2}}$. All of the quantities are computable from the above equations and the "closed form" equation for $f_n(\pi, j)$ as given in the previous section. However, the calculations are cumbersome and can be done more directly, as we will do in the next section. We close this section by obtaining the generating function for $f_n(I, j)$ and hence for all the quantities given above.


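Theorem 4A: If

$$F_n(y) = \sum_{j \ge 1} f_n(I, j)\, \frac{y^j}{j!}$$

then

$$\text{K.}\quad F_n(y) = -n! \sum_{k=1}^{n} (-1)^{k} (k-1)! \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}}\ \prod_{i=1}^{n} \frac{1}{k_i!} \left( \frac{B_i(y)}{i!} \right)^{k_i},$$

where $B_i(y) = (1+y)^{\binom{i}{2}}$. (This formula is related to Faà di Bruno's formula for derivatives and the classic Bell polynomials.)

Proof: From the result of Section III, counting sequences rather than probabilities,

$$f_n(I, j) = n! \sum_{k=1}^{n} \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}} (-1)^{k-1} (k-1)!\, \frac{\left( \sum_i k_i \binom{i}{2} \right)_{(j)}}{k_1!\, k_2! \cdots k_n!\, (1!)^{k_1} (2!)^{k_2} \cdots (n!)^{k_n}},$$

where $(t)_{(j)} = t(t-1)\cdots(t-j+1)$ is the falling factorial. Thus

$$F_n(y) = n! \sum_{k=1}^{n} (k-1)!\, (-1)^{k-1} \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}} \frac{\sum_{j \ge 1} \binom{\sum_i k_i \binom{i}{2}}{j}\, y^j}{k_1! \cdots k_n!\, (1!)^{k_1} (2!)^{k_2} \cdots (n!)^{k_n}}
= n! \sum_{k=1}^{n} \sum_{\substack{(k_1, \ldots, k_n) \\ \sum k_i = k \\ \sum i k_i = n}} \frac{(k-1)!\, (-1)^{k-1}\, (1+y)^{\sum_i k_i \binom{i}{2}}}{k_1! \cdots k_n!\, (1!)^{k_1} (2!)^{k_2} \cdots (n!)^{k_n}}$$

(the constant terms cancel for $n \ge 2$, since $\sum_{\sigma} \mu(\sigma, I) = 0$), and the result follows by expanding

$$(1+y)^{k_1 \binom{1}{2} + k_2 \binom{2}{2} + \cdots + k_n \binom{n}{2}} = (1+y)^{k_1 \binom{1}{2}}\, (1+y)^{k_2 \binom{2}{2}} \cdots (1+y)^{k_n \binom{n}{2}}.$$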

V. CALCULATIONS

We can determine pairs of equations from which we can recursively compute $f_n(I, j)$ from $C_n(I, j)$ and conversely. These equations can be derived from the Markovian property or from the fundamental isomorphic decomposition of $\Pi_n$.

Theorem 5A: (Basic Recurrence)

If $\pi$ is of type $(k_1, k_2, \ldots, k_n)$ then

$$f_n(\pi, j) = \sum M(j;\ w_{2,1}, w_{2,2}, \ldots, w_{2,k_2};\ w_{3,1}, w_{3,2}, \ldots, w_{3,k_3};\ \ldots;\ w_{n,1}, w_{n,2}, \ldots, w_{n,k_n}) \prod_{t=1}^{k_2} f_2(I, w_{2,t}) \prod_{t=1}^{k_3} f_3(I, w_{3,t}) \cdots \prod_{t=1}^{k_n} f_n(I, w_{n,t}),$$

where $M(j;\ w_{2,1}, \ldots, w_{2,k_2};\ w_{3,1}, \ldots, w_{3,k_3};\ \ldots;\ w_{n,1}, \ldots, w_{n,k_n})$ is the multinomial coefficient and the sum is over all partitions of the integer j into $\sum_{i=2}^{n} k_i$ parts.



Proof: If $\pi$ is a partition of type $(k_1, k_2, \ldots, k_n)$ then if a sequence of j edges produces $\pi$ the sequence must contain places for the $w_{v,t}$ edges which come from the t-th part of those parts which have v vertices. The set of subscripts {1, 2, ..., j} can be partitioned in $M(j; \ldots)$ ways into such parts. The edges in the part with $w_{v,t}$ edges can then be arranged in $f_v(I, w_{v,t})$ ways. The product formulation follows from the basic isomorphism theorem.

Corollary 1: If $\pi$ is a dual atom of type (r, n-r) (i.e., one part with r elements, the other with (n-r) elements) then

$$f_n(\pi, j) = \sum_{w=1}^{j} \binom{j}{w} f_r(I, w)\, f_{n-r}(I, j-w) \quad \text{when } n > 2,\ r > 1,\ (n-r) > 1;$$

$$f_n(\pi, j) = f_{n-1}(I; j) \quad \text{when } r = 1 \text{ or } n-1 = r;$$

$$f_2(I, 1) = 1, \qquad f_2(I, j) = 0, \quad j > 1.$$


Corollary 2: [5], [3], [16].

If T(n) is the number of trees on n vertices then

$$T(n) = \frac{1}{2(n-1)} \sum_{i=1}^{n-1} \binom{n}{i}\, T(i)\, T(n-i)\, i\, (n-i).$$

Proof: If we set j = n-2 in Corollary 1 then

$$f_n(\pi, n-2) = \sum_{w=1}^{n-2} \binom{n-2}{w} f_r(I, w)\, f_{n-r}(I, n-2-w).$$

This expression has one non-zero term, so

$$f_n(\pi, n-2) = \binom{n-2}{r-1} f_r(I, r-1)\, f_{n-r}(I, n-2-(r-1)).$$

Since obviously $f_r(I, r-1) = T(r)\,(r-1)!$, we obtain

$$f_n(\pi, n-2) = (n-2)!\; T(r)\, T(n-r).$$

Since $\pi$ has two parts we get a tree on n vertices by a single edge connecting those parts. Summing over all such partitions then yields all trees on n vertices.
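The recurrence of Corollary 2 is easy to evaluate directly. The following sketch (with a hypothetical function name) tabulates T(n) from T(1) = 1 and can be compared with Cayley's formula $n^{n-2}$ for the number of labeled trees.

```python
from math import comb


def tree_counts(n_max):
    """Return a list T with T[n] = number of labeled trees on n vertices, n = 1..n_max."""
    T = [0, 1]  # T[0] is unused padding; T[1] = 1
    for n in range(2, n_max + 1):
        total = sum(
            comb(n, i) * T[i] * T[n - i] * i * (n - i) for i in range(1, n)
        )
        # The factor 1 / (2(n-1)) always divides the sum exactly.
        T.append(total // (2 * (n - 1)))
    return T
```

For example, `tree_counts(4)[4]` gives 16, the value of $T_4$ used in the n = 4 example of this section.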


Theorem 5B: The quantity $C_n(I, j)$ is given by

$$C_n(I, j) = \sum_{\pi \text{ a dual atom}} f_n(\pi, j-1)\, h(\pi, I), \qquad n > 2;$$

$$C_1(I, j) = 0 \quad \text{for all } j;$$

$$C_2(I, 1) = 1, \qquad C_2(I, j) = 0, \quad j > 1.$$

Proof: The proof follows by the Markovian property. If we are to enter I for the first time on the jth edge then after (j-1) edges we must be in a dual atom.


Corollary:

$$C_n(I, j) = \frac{1}{2} \sum_{r=2}^{n-2} \binom{n}{r} f_n(\pi_{r,n-r};\ j-1)\; r\,(n-r) + n(n-1)\, f_n(\pi_{1,n-1};\ j-1).$$

(The notation $\pi_{i,j}$ means $\pi$ is a dual atom of type (i, j).)

Proof: Substitution of the formula of Corollary 1 to Theorem 5A into Theorem 5B.

Theorem 5C: The quantity $C_n(I, j)$ can be obtained from $f_{n-1}, f_{n-2}, \ldots$ using

$$C_n(I, j) = \frac{1}{2} \sum_{r=2}^{n-2} \binom{n}{r} \sum_{w=1}^{j-1} \binom{j-1}{w} f_r(I, w)\, f_{n-r}(I, j-1-w)\; r\,(n-r) + n(n-1)\, f_{n-1}(I; j-1).$$

Proof: Combine Theorems 5A and 5B.

Theorem 5D: The quantity $f_n(I, j)$ can be computed from $C_n(I, j)$ by the equation

$$f_n(I, j) = \sum_{k=1}^{j} C_n(I, k) \left( \tbinom{n}{2} - k \right)_{(j-k)},$$

where $(t)_{(k)} = t(t-1)\cdots(t-k+1)$ is the falling factorial.

Proof: This follows from the Markovian property. If we are to be in state I after j steps then we arrived for the first time on the kth step and stayed in I.


Theorems 5C and 5D provide a coupled pair of equations to compute $C_n$ from $f_{n-1}, f_{n-2}, \ldots$ and then $f_n$ from $C_n$, and so forth. Some sample calculations are: $f_2(I, 1) = 1$, and $f_2(I, j) = 0$, $j > 1$.

We compute $C_3$ from $f_2$:

$$C_3(I, 1) = 0$$
$$C_3(I, 2) = \tfrac{1}{2}\,(\text{empty sum}) + 3 \cdot 2 \cdot f_2(I; 1) = 6$$
$$C_3(I, j) = 0, \quad j > 2.$$

Now we compute $f_3$ from $C_3$:

$$f_3(I, 2) = \sum_{k=2}^{2} C_3(I, k) = C_3(I, 2) = 6$$
$$f_3(I, 3) = \sum_{k=2}^{3} C_3(I, k)\,(3-k)_{(3-k)} = C_3(I, 2) \cdot 1 = 6$$
$$f_3(I, j) = 0, \quad j > 3.$$

We can compute $C_4$ from $f_3$ and $f_2$:

$$C_4(I, 3) = \frac{1}{2} \sum_{r=2}^{2} \binom{4}{r} \sum_{w=1}^{2} \binom{2}{w} f_r(I; w)\, f_{4-r}(I, 2-w)\; r\,(4-r) + 4 \cdot 3 \cdot f_3(I; 2)$$
$$= \frac{6}{2} \sum_{w=1}^{2} \binom{2}{w} f_2(I; w)\, f_2(I; 2-w) \cdot 2 \cdot 2 + 4 \cdot 3 \cdot 6$$
$$= 12\,[\,2\, f_2(I; 1)\, f_2(I, 1) + 1 \cdot f_2(I; 2)\, f_2(I, 0)\,] + 72$$
$$= 12\,[\,2 \cdot 1 + 0\,] + 72 = 96$$

$$C_4(I, 4) = \frac{1}{2} \sum_{r=2}^{2} \binom{4}{r} \sum_{w=1}^{3} \binom{3}{w} f_r(I, w)\, f_{4-r}(I; 3-w)\; r\,(4-r) + 4 \cdot 3 \cdot f_3(I, 3)$$
$$= 12\,[\,3 \cdot 0 + 3 \cdot 0\,] + 72 = 72$$

$$C_4(I, j) = 0; \quad j \ge 5.$$

Similarly, we can compute $f_4$ from $C_4$ and obtain $f_4(I, 3) = 96$, $f_4(I, 4) = 360$, $f_4(I, 5) = 720$, $f_4(I, 6) = 720$, and $f_4(I; j) = 0$, $j > 6$. The coupled equations themselves can be used to obtain the exponential generating functions for the sequences $f_n$ and $C_n$. These generating functions turn out to be exponential convolutions and have been recently studied [17].
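The coupled equations of Theorems 5C and 5D translate directly into a short table-filling routine. The following Python sketch is one possible arrangement, not the authors' program: the function and variable names are hypothetical, values outside the tabulated range are treated as zero, and `math.perm` (Python 3.8+) supplies the falling factorial.

```python
from math import comb, perm


def coupled_tables(n_max):
    """Return (f, C) with f[n][j] = f_n(I, j) and C[n][j] = C_n(I, j) for 2 <= n <= n_max."""
    f, C = {}, {}

    def get(table, n, j):
        # Values outside the tabulated range are zero.
        row = table[n]
        return row[j] if 0 <= j < len(row) else 0

    for n in range(2, n_max + 1):
        N = comb(n, 2)  # number of edges of the complete graph
        if n == 2:
            C[n] = [0, 1]  # C_2(I, 1) = 1
        else:
            row = [0] * (N + 1)
            for j in range(1, N + 1):
                inner = 0
                for r in range(2, n - 1):  # r = 2, ..., n-2
                    conv = sum(
                        comb(j - 1, w) * get(f, r, w) * get(f, n - r, j - 1 - w)
                        for w in range(1, j)
                    )
                    inner += comb(n, r) * r * (n - r) * conv
                # Theorem 5C; the sum over r is always even.
                row[j] = inner // 2 + n * (n - 1) * get(f, n - 1, j - 1)
            C[n] = row
        # Theorem 5D: enter I on edge k, then append any j-k of the N-k unused edges.
        f[n] = [
            sum(C[n][k] * perm(N - k, j - k) for k in range(1, j + 1))
            for j in range(N + 1)
        ]
    return f, C
```

With `f, C = coupled_tables(4)`, the rows `C[3]`, `f[3]`, `C[4]` and `f[4]` reproduce the sample values computed above.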
It is possible to compute $f_j(I)$ from the formula of Corollary 1, but the calculation is lengthy. An example follows.

Example (n = 4). For n = 4, $T_4 = 16$, hence the probability of a tree should be 16/20 = 4/5 when j = 3.

Parts (k)   Class or Type   sum_i k_i C(i,2)   C(sum_i k_i C(i,2), 3)
1           (0,0,0,1)       6                  20
2           (1,0,1,0)       3                  1
2           (0,2,0,0)       2                  0
3           (2,1,0,0)       1                  0
4           (4,0,0,0)       0                  0

Only the first two rows contribute, so

$$f_3(I) = \frac{4!}{4!} \cdot \frac{20}{20} - \frac{4!}{1!\,3!} \cdot \frac{1}{20} = 1 - \frac{4}{20} = \frac{16}{20} = \frac{4}{5}.$$
VI. A NEW MEASURE OF CONNECTIVITY

Let G be a connected graph with n nodes and m edges. Let S(G) be the set of the m! permutations of edges, viewed as a set of m-tuples. For each sequence $s \in S(G)$ define $C_c(s)$ as the index of the first edge in s for which the graph with the first $C_c(s)$ edges is connected. The number $m - C_c(s)$ then measures "how long" the sequence s has been a connected graph. Intuitively, if many of the sequences in S(G) have large $m - C_c(s)$ then the graph G is more connected. In particular we take the average or first moment of the numbers $\{m - C_c(s)\}$ as the definition of the connectivity of G. We can also take higher moments, which would lead to more precise measures of connectivity.

Definition: The connectivity or "mean connectivity" of a connected graph G, denoted by C(G), is given by:

$$C(G) = \frac{1}{m!} \sum_{s \in S(G)} (m - C_c(s)) = m - \frac{1}{m!} \sum_{s \in S(G)} C_c(s).$$

Alternatively, if $C_G(k)$ is the number of sequences in S(G) for which $C_c(s) = k$ then

$$C(G) = m - \frac{1}{m!} \sum_{k=n-1}^{m} k\, C_G(k).$$

Example: If G is a tree on n nodes then m = n-1 and $C_c(s) = n-1$ for all $s \in S(G)$, so that

$$C(G) = (n-1) - \frac{1}{(n-1)!}\,(n-1)\,(n-1)! = (n-1) - (n-1) = 0.$$

Thus, all trees have connectivity zero.


Example: Consider a triangle:

$s_1$ = {1,2}, {1,3}, {2,3},  $C_c(s_1) = 2$
$s_2$ = {1,2}, {2,3}, {1,3},  $C_c(s_2) = 2$
$s_3$ = {1,3}, {2,3}, {1,2},  $C_c(s_3) = 2$
$s_4$ = {1,3}, {1,2}, {2,3},  $C_c(s_4) = 2$
$s_5$ = {2,3}, {1,3}, {1,2},  $C_c(s_5) = 2$
$s_6$ = {2,3}, {1,2}, {1,3},  $C_c(s_6) = 2$

$$C(G) = 3 - \frac{1}{6} \sum_{s \in S(G)} 2 = 3 - \frac{1}{6}\,[12] = 3 - 2 = 1.$$

So that the connectivity of the complete graph $K_3$ is $C(K_3) = 1$. The connectivity of $K_2$ is zero since $K_2$ is a tree.
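For small graphs the definition can be evaluated by brute force over all m! edge orderings. The sketch below assumes nodes labeled 0, ..., n-1 and a connected input graph; the function names are hypothetical and the enumeration is only feasible for a handful of edges.

```python
from fractions import Fraction
from itertools import permutations


def first_connected_index(n, sequence):
    """1-based index of the first edge after which the graph is connected, else None."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for index, (a, b) in enumerate(sequence, start=1):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
            if components == 1:
                return index
    return None


def mean_connectivity(n, edges):
    """C(G) = average of m - C_c(s) over all orderings s of the edges of a connected G."""
    m = len(edges)
    total = 0
    count = 0
    for s in permutations(edges):
        total += m - first_connected_index(n, s)
        count += 1
    return Fraction(total, count)
```

For instance, `mean_connectivity(3, [(0, 1), (0, 2), (1, 2)])` returns 1 for the triangle, and any tree returns 0.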

Problem 1: Determine the sequence $C(K_2), C(K_3), \ldots$; or its generating function; i.e.,

$$\sum_{i} C(K_i)\, z^i.$$

We now turn our attention to disconnected graphs. Let G be a disconnected graph and let S(G) be the set of all permutations of edges not in G. For each $s \in S(G)$ let $C_d(s)$ be the index of the first edge in the sequence s for which G would become connected if it and all the preceding edges were added to G.

Definition: If G is a disconnected graph with m edges we define its connectivity,

$$C(G) = - \sum_{s \in S(G)} \frac{C_d(s)}{\left( \binom{n}{2} - m \right)!} = - \frac{1}{\left( \binom{n}{2} - m \right)!} \sum_{s \in S(G)} C_d(s).$$

Example: The empty graph $\Omega_2$ on two nodes has m = 0, so $C_d(s) = 1$. Therefore,

$$C(\Omega_2) = -\frac{1}{1!} = -1.$$

Example: We compute the connectivity of $\Omega_3$. There are six sequences in S(G) and $C_d(s) = 2$. Therefore,

$$C(\Omega_3) = -\frac{1}{6} \sum_{s \in S(G)} 2 = -\frac{12}{6} = -2.$$

The computation of the sequence $\{C(\Omega_n)\}_{n=2}^{\infty}$ or its generating function is carried out in the next sections.

Example: We compute the connectivity of the graph G depicted below:

The graph G (nodes 1, 2, 3 with the single edge {1,2}) is disconnected. The edges not in G are {1,3}, {2,3}; there are two permutations, and $C_d(s) = 1$ for both, so that

$$C(G) = -\frac{1}{2} \sum_{s \in S(G)} C_d(s) = -\frac{1}{2}\,(1 + 1) = -1.$$
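The definition for disconnected graphs can be checked the same way. This is a minimal brute-force sketch, assuming nodes labeled 0, ..., n-1 and hypothetical function names; it seeds a union-find structure with the edges of G and then adds the missing edges in every possible order.

```python
from fractions import Fraction
from itertools import combinations, permutations


def edges_until_connected(n, present, sequence):
    """C_d(s): how many edges of `sequence` must be added to `present` to connect G."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for a, b in present:  # start from the components of G itself
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    for index, (a, b) in enumerate(sequence, start=1):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
            if components == 1:
                return index
    return None


def disconnected_connectivity(n, edges):
    """C(G) = minus the mean of C_d(s) over all orderings of the edges not in G."""
    present = {frozenset(e) for e in edges}
    missing = [e for e in combinations(range(n), 2) if frozenset(e) not in present]
    total = 0
    count = 0
    for s in permutations(missing):
        total += edges_until_connected(n, edges, s)
        count += 1
    return -Fraction(total, count)
```

With zero-based labels, the three examples above correspond to `disconnected_connectivity(2, [])`, `disconnected_connectivity(3, [])` and `disconnected_connectivity(3, [(0, 1)])`.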

Many  questions,  problems  and  conjectures  are  already  apparent, 
some  of  these  are: 

Problem 2: There seems to be a duality in terms of connectivity between a graph and its complement. Can the connectivity of a graph be obtained in terms of its complement? The answer is yes and the appropriate result is given in Section VII.

Problem 3: Is the connectivity of a graph related to the reliability of its network? The answer is yes and this is discussed at an elementary level in Section VIII.

Problem 4: If we are given a graph G, then all sequences $s \in S(G)$ have "connectivity weights" C(s). What are the minimal and maximal values of C(s) over all $s \in S(G)$? In statistical terms we are asking for the range of C(s) over S(G); i.e.,

$$\max_{s \in S(G)} C(s) - \min_{s \in S(G)} C(s).$$

Along these same lines, given a class G of graphs what are the minimal and maximal values of the connectivities of the graphs in G? Characterize the classes of graphs which achieve these values.

This problem is discussed in Section IX.

Problem 5: Can the connectivity of a graph be determined by the connectivity of various bipartite graphs? What is the connectivity of a bipartite graph consisting of two connected components?

This question is discussed in Section X.
Problem 6: There are many descriptive combinatorial functions associated with a graph: minimal cutset size, girth, diameter, number of spanning trees, cliques, etc. How are these related to connectivity? The definitions of those quantities are given in Section XI.

Problem 7: There are many combinatorial questions associated with the sequence of graphs determined by a sequence of edges in S(G). Many of these questions concern the number and type of the connected component structure associated with the sequence of graphs associated with each $s \in S(G)$. Some of the questions are formulated in Section XI.

Problem 8: There should obviously be relationships between connectivity of a graph and its chromatic polynomial or other such coloring properties. In Section XI we give Whitney's derivation of the chromatic polynomial of a graph. It seems that the Whitney approach gives a tie-in with connectivity since it deals with colorings of subgraphs, and that seems to be a way of approaching connectivity.

Problem 9: When is the connectivity of a graph greater than the connectivity of every subgraph?

Problem 10: If two graphs have the same number of edges but different size cutsets, does it follow that their connectivities have the same properties?

Problem 11: How many "moments" are required to uniquely determine a graph or isomorphic classes of graphs?

Problem  12:  How  do  we  construct  "highly"  connected  graphs,  or 

build  on  existing  graphs  to  increase  connectivity  at  low  cost? 
VII. A GRAPH AND ITS COMPLEMENT

We investigate Problem 2 and derive a relationship between a graph and its complement.

Definition: If G is a graph then its complement G' is the graph on the nodes of G whose edge set is the set of all edges not in G.


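Theorem 7A: If G is a disconnected graph with m edges on n nodes then

$$C(G') - C(G) = \binom{n}{2} - m + \frac{1}{\left( \binom{n}{2} - m \right)!} \sum_{s \in S(G)} \left( C_d(s) - C_c(s) \right).$$

Proof: If G is disconnected then G' is connected and G' has $\binom{n}{2} - m$ edges. Since S(G) and S(G') are identical as sets of $\left( \binom{n}{2} - m \right)$-tuples of edges, we have by direct calculation:

$$C(G') - C(G) = \left( \tbinom{n}{2} - m \right) - \sum_{s \in S(G')} \frac{C_c(s)}{\left( \binom{n}{2} - m \right)!} + \sum_{s \in S(G)} \frac{C_d(s)}{\left( \binom{n}{2} - m \right)!}$$

$$= \left( \tbinom{n}{2} - m \right) + \frac{1}{\left( \binom{n}{2} - m \right)!} \left[ \sum_{s \in S(G)} C_d(s) - \sum_{s \in S(G')} C_c(s) \right]$$

$$= \left( \tbinom{n}{2} - m \right) + \frac{1}{\left( \binom{n}{2} - m \right)!} \sum_{s \in S(G)} \left[ C_d(s) - C_c(s) \right].$$

Corollary 1: For the complete and empty graphs,

$$C(K_n) - C(\Omega_n) = \binom{n}{2}.$$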
Proof: Set $G = \Omega_n$ in Theorem 7A and note that $C_d(s) = C_c(s)$ for all $s \in S(\Omega_n)$.

Corollary 2: For the generating functions $C_K(z)$ and $C_\Omega(z)$ of $\{C(K_n)\}$ and $\{C(\Omega_n)\}$, respectively, we have

$$C_K(z) - C_\Omega(z) = \frac{z^2}{(1-z)^3}.$$

Proof: By direct calculation.
VIII. APPLICATION TO RELIABILITY AND OTHER AREAS

The analysis of sequence sets such as S(G) arises in many areas other than in a "pure" study of graph connectivity. In set-merging algorithms we have a partition of a set of n points which we may think of as nodes of a graph. Unordered pairs of points are presented sequentially, which we may think of as edges of the graph. If the two points of an edge are in different parts of the partition then the two parts are merged (their set theoretic union is formed). Otherwise, the same analysis is carried out with the next edge. The obvious questions concern the statistical moments of the number of edges needed to connect the graph (merge all parts into one part) or to reach some other "state". Many of these questions can be restated in terms of the statistics of S(G).

Another area in which the set S(G) arises is in the use of "random edges" to generate a spanning tree for a graph or network with n nodes. Edges are randomly generated and added to a graph (beginning perhaps with the empty graph). The question arises in the analysis of such a process as to how long it will take before the resulting "random graph" is connected. The answer to this question, if we start with an empty graph on n nodes, is $-C(\Omega_n)$, i.e., the mean of $C_d(s)$ over $S(\Omega_n)$. It should also be clear that this question is the same as the set-merging question.
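The set-merging process described above can be sketched as follows. This is an illustrative simulation rather than any particular published algorithm: parts are kept as Python sets, the smaller part is merged into the larger, and the mean number of pairs needed is estimated by shuffling the edges of the complete graph with a seeded generator. All names are hypothetical.

```python
import random
from itertools import combinations


def edges_to_merge_all(n, pairs):
    """Number of pairs consumed before all n points lie in one part, or None."""
    part_of = {v: {v} for v in range(n)}  # each point starts in its own part
    parts = n
    for used, (a, b) in enumerate(pairs, start=1):
        pa, pb = part_of[a], part_of[b]
        if pa is not pb:
            if len(pa) < len(pb):
                pa, pb = pb, pa
            pa |= pb  # set-theoretic union of the two parts
            for v in pb:
                part_of[v] = pa
            parts -= 1
            if parts == 1:
                return used
    return None


def mean_edges_to_connect(n, trials, seed=0):
    """Monte Carlo estimate of the mean number of random edges needed to connect n nodes."""
    rng = random.Random(seed)
    pairs = list(combinations(range(n), 2))
    total = 0
    for _ in range(trials):
        rng.shuffle(pairs)
        total += edges_to_merge_all(n, pairs)
    return total / trials
```

For n = 3 every ordering needs exactly two edges, so the estimate is exactly 2.0; for n = 4 it should settle near 3.2 as the number of trials grows.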

A large area of potential application of the notion of connectivity is the study of the reliability of networks. Let G be a graph and $R_p(G)$ be the probability that the graph G is connected if p is the probability that each branch (independently) is "on" or "working" or "up". Clearly, if G has m edges,

$$R_p(G) = \sum_{k=n-1}^{m} A_k\, p^k (1-p)^{m-k},$$

where $A_k$ is the number of connected subgraphs of G with n nodes and k edges. Alternatively,

$$R_p(G) = 1 - \sum_{i=w}^{m} B_i\, q^i p^{m-i},$$

where $q = 1 - p$, $B_i$ is the number of disconnected subgraphs of G with n nodes and (m-i) edges, and w is the size of a minimal cutset of G.


A simple recursion for $R_p(G)$ relates the reliability of graphs or networks to subgraphs. This might be helpful in relating reliability to connectivity and to coloring. Let e be a given edge in G, and call e "open" when it is working (probability p) and "closed" otherwise (probability q = 1 - p), so that:

$$R_p(G) = P\{G \text{ is connected} \mid e \text{ is open}\}\, P\{e \text{ is open}\} + P\{G \text{ is connected} \mid e \text{ is closed}\}\, P\{e \text{ is closed}\}$$

$$= p\, P\{G \text{ is connected} \mid e \text{ is open}\} + q\, R_p(G'),$$

where G' is the subgraph of G obtained by removal of e. Let G'' be the graph obtained by identifying the nodes incident to e. Thus

$$R_p(G) = p\, R_p(G'') + q\, R_p(G'),$$

where

$$R_p(G'') \ge R_p(G) \ge R_p(G').$$

The same type of formula can be obtained by taking a subset of edges of G which is not a cutset. This would help tie in with coloring problems.
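The deletion and contraction recursion can be turned into a small recursive routine and compared with direct enumeration of edge subsets. The sketch below is only illustrative: names are hypothetical, the input is assumed to be a simple graph given as a node set and an edge list, contraction may create parallel edges (which are kept), and self-loops are dropped since they do not affect connectivity. The running time is exponential in the number of edges.

```python
from fractions import Fraction
from itertools import combinations


def is_connected(nodes, edges):
    """Depth-first search connectivity test."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for a, b in edges:
            y = b if a == x else a if b == x else None
            if y is not None and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == nodes


def reliability_factoring(nodes, edges, p):
    """R_p(G) = p * R_p(G'') + (1 - p) * R_p(G'), factoring on the first edge."""
    if len(nodes) == 1:
        return 1
    if not edges:
        return 0
    (a, b), rest = edges[0], edges[1:]
    deleted = reliability_factoring(nodes, rest, p)  # G': remove e
    # G'': identify b with a, keep parallel edges, drop self-loops.
    merged = [(a if x == b else x, a if y == b else y) for x, y in rest]
    merged = [(x, y) for x, y in merged if x != y]
    contracted = reliability_factoring(nodes - {b}, merged, p)
    return p * contracted + (1 - p) * deleted


def reliability_brute(nodes, edges, p):
    """Sum p^k (1-p)^(m-k) over every subset of working edges that connects G."""
    m = len(edges)
    total = 0
    for k in range(m + 1):
        for working in combinations(edges, k):
            if is_connected(nodes, working):
                total += p ** k * (1 - p) ** (m - k)
    return total
```

For the triangle with p = 1/2, both routines give 1/2, i.e. $p^3 + 3p^2 q$.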

The Moore-Shannon definition of reliability is similar, although it asks for the probability $p_{ij}$ that a given pair of nodes {i, j} is connected by some path in G. Clearly:

$$p_{ij} = \sum_{k=1}^{m} A_k\, p^k (1-p)^{m-k},$$

where $A_k$ is the number of distinct k-edge subgraphs of G which contain a path between nodes i and j, and m is the number of edges. Alternatively, if $B_k$ is the number of k-edge subsets of edges which when removed leave nodes i and j disconnected, then

$$1 - p_{ij} = \sum_{k=1}^{m} B_k\, (1-p)^k p^{m-k}.$$

It is easy to relate $R_p(G)$ to C(G) and it should be easy to relate $p_{ij}$ to C(G). Let G be a graph on n nodes with m edges. Let $f_G(k)$ be the number of sequences in S(G) for which the first k edges on the n nodes form a connected graph.


Theorem 8A: Let $E_w(p)$ be the expected value of X! when X is a binomial random variable b(w; p). Let $C_k(p) = C_G(k)\, p^k$; then

$$R_p(G) = \sum_{k=0}^{m} E_{m-k}(p)\, C_k(p).$$

Proof: By definition:

$$R_p(G) = \sum_{k=n-1}^{m} A_k\, p^k (1-p)^{m-k}.$$

Now

$$A_k = \frac{f_G(k)}{k!} = \frac{1}{k!} \sum_{w=n-1}^{k} C_G(w)\, (m-w)_{(k-w)},$$

where $(t)_{(k)} = t(t-1)\cdots(t-k+1)$ is the falling factorial. It follows by substitution that

$$R_p(G) = \sum_{k=n-1}^{m} \frac{1}{k!} \sum_{w=n-1}^{k} C_G(w)\, (m-w)_{(k-w)}\, p^k (1-p)^{m-k}.$$

Rearranging terms and collecting by $C_G(w)$ yields

$$R_p(G) = \sum_{k=n-1}^{m} C_G(k)\, p^k \sum_{j=0}^{m-k} (m-k)_{(j)}\, p^j (1-p)^{m-j-k},$$

which proves the theorem.
Problem 13: Devise an algorithm for computing $C_G(k)$. Perhaps a Markovian type algorithm as used in [22] for the complete graph might work.

Problem 14: Study the behavior of the sequence $C_k(p)$ in terms of the connectivity of connected graphs. In fact, if $C'_k(p)$ is the derivative of $C_k(p)$ with respect to p, then obviously the connectivity of G is

$$C(G) = \left[\, m - \frac{1}{m!} \sum_{k=n-1}^{m} C'_k(p) \right]_{p=1}.$$

Problem 15: Suppose two graphs or networks with the same number of nodes and edges have different reliability functions. To what extent can the differences be explained by the "moments" of connectivity?

Problem 16: Relate statistical connectivity to the Moore-Shannon definition of reliability.
IX. GRAPHS WITH MINIMAL AND MAXIMAL CONNECTIVITY


In Problem 4, we asked about graphs with minimal and maximal
connectivity. We shall examine the case of connected graphs with n
nodes and n edges. The connected graphs with (n-1) edges all have
connectivity zero. What is the minimal connectivity of a graph
with n edges? What is the maximal value? Obviously if G has n
nodes and n edges and is connected, $0 < C(G) \le 1$. More generally, if
G has m edges, $0 \le C(G) \le m-(n-1)$. There are no connected graphs with
connectivity zero which have more than (n-1) edges.

The following classes of graphs with n nodes and n edges have
connectivity 3/n:

[Figure: a graph on nodes 1, 2, ..., n with n edges e_1, e_2, ...
that contains a triangle.]

There are 3(n-1)! sequences of the n edges whose associated graph
becomes connected for the first time on the (n-1)st edge, and
(n-3)(n-1)! sequences which become connected graphs for the first time
with the nth edge. Therefore

$$C(G) = m - \frac{1}{m!} \sum_{k=n-1}^{m} k\,C_G(k) = n - \frac{1}{n!} \sum_{k=n-1}^{n} k\,C_G(k)$$

$$= n - \frac{1}{n!}\left[(n-1)(n-1)!\cdot 3 + n(n-3)(n-1)!\right]$$

$$= n - \frac{3(n-1)}{n} - (n-3) = 3 - \frac{3(n-1)}{n} = \frac{3}{n}$$
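
The counting argument can be checked by direct enumeration on small graphs. The sketch below is an illustration rather than the report's method: it takes connectivity to be the number of edges minus the average time to connectivity over all orderings of the edges, which is the reading suggested by the formula above. Function names and node labels (1 through n) are assumptions.

```python
from fractions import Fraction
from itertools import permutations

def time_to_connectivity(n, order):
    """1-based index of the edge at which nodes 1..n first become connected."""
    parent = list(range(n + 1))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n
    for step, (a, b) in enumerate(order, start=1):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
            if components == 1:
                return step
    raise ValueError("the edges do not connect all n nodes")

def connectivity(n, edges):
    """C(G) = m - (average time to connectivity over all m! edge sequences)."""
    total = count = 0
    for order in permutations(edges):
        total += time_to_connectivity(n, order)
        count += 1
    return len(edges) - Fraction(total, count)

# Triangle 1-2-3 with a path 3-4-5 attached: n = 5 nodes, n = 5 edges.
TRIANGLE_WITH_TAIL = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
# Cycle on 5 nodes.
CYCLE_5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
```

With n = 5 the triangle-with-tail graph should give 3/5 and the cycle should give 1. Because all m! orderings are enumerated, the sketch is only usable for very small m.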


Another graph with the same connectivity is

[Figure: a second graph on nodes 1, ..., n with edges labeled
e_1, e_2, e_4, e_5, ...]
It should be easy to prove the following conjectures.

Conjecture  1:  Among  the  class  of  connected  graphs  with  n 
nodes  and  n edges  the  minimum  value  of  the  connectivity  is  3/n  and 
is  achieved  by  those  graphs  which  contain  a triangle. 

More  generally. 

Conjecture  2:  If  a connected  graph  on  n nodes  with  n edges 

contains  a k-gon  then  its  connectivity  is  k/n. 

Conjecture 2 applies to the maximum value of the connectivity.
Indeed, the cycle graph $C_n$ has connectivity one, since the omission
of any one edge does not destroy connectivity. Many other questions
are now apparent!

Problem 17: Generalize Conjectures 1 and 2 for graphs with
more than n edges.

Problem 18: Are there parallel results for the connectivity
of disconnected graphs?

Problem 19: In what sense are the reliabilities of the above
networks minimal and maximal?
X. CONNECTIVITY OF BIPARTITE GRAPHS


Let G be a graph on n nodes with two connected components $C_1$
and $C_2$ with $m_1$, $m_2$ edges and $V_1$, $V_2$ nodes, respectively.
There are $\binom{n}{2}$ possible edges of which $(m_1 + m_2)$ are already in
the graph G, so that we form sequences from the $\binom{n}{2} - (m_1 + m_2)$ edges
not in the graph G. Let X be the random variable "time to connectivity"
on S(G). There are $V_1 V_2$ "good" edges and $\binom{n}{2} - (m_1 + m_2) - V_1 V_2$
"bad" edges from which to select at random (form sequences) and add to G
one at a time. Let $P_k = P\{X=k\}$, so that

$$P_1 = P\{X=1\} = \frac{V_1 V_2}{\binom{n}{2} - (m_1+m_2)}$$

$$P_2 = P\{X=2\} = (1-P_1)\left(\frac{V_1 V_2}{\binom{n}{2} - (m_1+m_2) - 1}\right)$$

$$P_n = \left[\prod_{i=1}^{n-1} (1-P_i)\right]\left(\frac{V_1 V_2}{\binom{n}{2} - (m_1+m_2) - (n-1)}\right)$$

Let $G(z) = \sum_{k \ge 1} P_k z^k$ be the generating function for the sequence
$\{P_n\}$.


Theorem 10A: The function G(z) satisfies the differential
equation

(1-z)  G'(z)  + [ (1-z)  (W+l ) = (wz) ] G(z)  + w (z-w-1)  = 0, 

where $w = V_1 V_2$ and $W = \binom{n}{2} - (m_1+m_2)$.
Proof: It is easy to see that $P_n$ satisfies the recurrence relation
Pn  = pn-l  (^W- (n-if } > for  ni2  with 


$$P_1 = \frac{w}{W}$$


The  theorem  follows  by  direct  application  of  the  generating 
function  and  a little  algebra. 

The  connectivity  of  the  bipartite  graph  is  given  by  G'(l). 

Perhaps  a more  detailed  analysis  of  the  differential  equation  will 
yield  some  more  information. 

It  is  possible  to  obtain  an  upper  bound  for  the  connectivity 
of  a bipartite  graph  which  we  conjecture  to  be  exact  in  the  limit. 

Let Y be the time to connectivity of the bipartite graph when
the edges are chosen with replacement. Obviously $E[Y] \ge E[X]$. The
time to connectivity for the replacement process is easy to compute
since the distribution of Y is geometric. In fact,

$$E[Y] = \sum_{k=1}^{\infty} k \left(1-\frac{w}{W}\right)^{k-1} \frac{w}{W} = \frac{W}{w}$$

so that we have proved:


Theorem 10B: Let G be a bipartite graph on n nodes with
connected components $C_1$ and $C_2$ with $m_1$, $m_2$ edges and $V_1$, $V_2$ nodes
respectively; then

$$C(G) \le \frac{\binom{n}{2} - (m_1+m_2)}{V_1 V_2}$$
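
The gap between the bound and the exact expectation can be examined numerically. The sketch below assumes the sampling model described above: W candidate edges, of which w are "good", drawn one at a time, with X the draw on which the first good edge appears. The function names are illustrative and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def expected_time_without_replacement(W, w):
    """E[X]: expected draw of the first good edge, sampling without replacement."""
    expected = Fraction(0)
    none_yet = Fraction(1)  # probability that the first k-1 draws were all bad
    for k in range(1, W - w + 2):
        remaining = W - (k - 1)
        p_k = none_yet * Fraction(w, remaining)
        expected += k * p_k
        none_yet *= Fraction(remaining - w, remaining)
    return expected

def expected_time_with_replacement(W, w):
    """E[Y] for the geometric distribution with success probability w/W."""
    return Fraction(W, w)
```

For example, with components of 2 and 3 nodes that are both trees (n = 5, m_1 + m_2 = 3), one has W = 7 and w = 6; the exact value is 8/7 against the bound 7/6. In general the without-replacement expectation equals (W+1)/(w+1), which never exceeds W/w when w is at most W.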


Problem 20: Obtain an asymptotic estimate in Theorem 10B by
suitably restricting the ranges of $m_1$, $m_2$, $V_1$, $V_2$.
Problem 21: If we interpret connectivity of a graph in reliability
terms as edges going out one at a time, selected at
random, what is a reasonable definition for the reliability of
a bipartite graph? In terms of Moore-Shannon it might be the
average time for two nodes (one in each component) to communicate.
Interpret and calculate "reliability of a bipartite graph" and
relate it to connectivity.
XI. DESCRIPTIVE COMBINATORIAL QUANTITIES IN A GRAPH

In  the  literature  of  graph  theory,  there  are  many  combinatorial 
type  quantities  which  measure  internal  structure  of  a graph.  Each 
of  the  quantities  should  be  related  to  connectivity.  We  will  not 
spell  out  the  specific  questions  since  they  should  be  obvious! 

Definition: If a graph G has p connected components, n nodes
and m edges, then the rank of G is $\rho(G) = n-p$ and its
cyclomatic number is $\nu(G) = m-\rho(G)$.

Theorem 11A: (Known). Let G be a graph and G' be a graph
formed from G by adding a new edge between nodes i and j (arbitrary);
then

$\rho(G') = \rho(G)$ and $\nu(G') = \nu(G)+1$ if i and j are in the same
component, while $\rho(G') = \rho(G)+1$ and $\nu(G') = \nu(G)$ if i and j are
in different components.

Corollary: $\rho(G) \ge 0$ and $\nu(G) \ge 0$.
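
The rank and cyclomatic number are easy to compute from an edge list, which makes it simple to observe how each changes when one edge is added. The following is a small sketch using a union-find structure to count components; the function name and the 0-based node labels are assumptions made for the illustration.

```python
def rank_and_cyclomatic(n, edges):
    """Return (rank, cyclomatic number) for a graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    rank = n - components             # rho(G) = n - p
    cyclomatic = len(edges) - rank    # nu(G) = m - rho(G)
    return rank, cyclomatic

# Path 0-1-2 plus an isolated node 3: two components.
BASE_EDGES = [(0, 1), (1, 2)]
```

Adding the edge (0, 2) joins two nodes of the same component, so only the cyclomatic number grows; adding (2, 3) joins two components, so only the rank grows.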

Definition : The  chromatic  number  of  a graph  is  the  smallest 

number  of  colors  needed  to  color  the  graph  so  that  no  adjacent 
vertices  have  the  same  color. 

Theorem  11B:  (Konig)  A graph  is  bi-chromatic  if  and  only  if  it 

contains  no  cycles  of  odd  length. 

It is not difficult to verify that the zeta function is invertible.
The inverse of the zeta function is called the Möbius function
and is denoted by $\mu(\cdot,\cdot)$. These observations are the basis of the
Möbius Inversion Theorem.
Theorem 2A: (Möbius-Rota Inversion Theorem)

If $f: P \to R$ is a real valued function on a finite partially ordered
set P, and p is an element of P such that f(x) = 0 when $x \not\ge p$, and

$$g(x) = \sum_{y \le x} f(y)$$

then we have

$$f(x) = \sum_{y \le x} g(y)\, \mu(y,x)$$

where $\mu(\cdot,\cdot)$ is the Möbius function of the partially ordered set P.

Some useful results for computing the Möbius function are given
in [20], in particular the next formula for the Möbius function
of $\Pi_n$.

Theorem 2B: The segment [x,y] is said to be of class $(c_1, c_2, \ldots, c_n)$
when the lattice [x,y] is isomorphic to the direct product of $c_1$
lattices isomorphic to $\Pi_1$, $c_2$ lattices isomorphic to $\Pi_2$, ..., $c_n$
lattices isomorphic to $\Pi_n$. If [x,y] is of class $(c_1, \ldots, c_n)$ then

$$\mu(x,y) = (-1)^{c_1+c_2+\cdots+c_n-n}\,(2!)^{c_3}(3!)^{c_4}\cdots((n-1)!)^{c_n}$$

Corollary: $\mu(0,1) = (-1)^{n-1}(n-1)!$
SIMULATION OF PACKET COMMUNICATION NETWORKS

I. INTRODUCTION

In developing a large scale communication network, one
encounters many problems which cannot be formulated or solved
analytically. Consequently, one resorts to simulation. In this
chapter, we outline the structure of a simulation program for
packet communication networks; specify the problems which can be
resolved by a simulator; give the description of the simulator
developed for the packet radio network; present results obtained by
the simulator; and discuss the future development of the simulator
for the packet radio network.

In  general,  there  are  various  degrees  of  simulation  depending 
on  the  amount  of  knowledge  (or  assumed  knowledge)  about  the  system 
operation,  and  the  objectives  of  the  simulation,  i.e.,  the  problems 
to  be  resolved.  We  particularly  distinguish  in  this  chapter  between 
a simulation  for  design  and  a simulation  for  development. 

A simulator  for  design  (e.g.,  in  a design  loop)  is  developed 
when  the  operation  of  the  system  is  completely  specified,  and  the 
objective  of  the  simulator  is  to  simulate  specific  parts  of  the 
system  which  cannot  be  analytically  modeled  or  whose  solution  is 
computationally  infeasible.  The  efficiency  of  such  a simulator  is 
of  major  importance;  consequently,  one  attempts  to  avoid  simulation 
[8]  wherever  possible,  in  the  simulation  program. 

The  objectives  of  a simulation  for  development  are  much  broader. 
In  this  case,  one  has  a set  of  theoretical  (untested)  hypotheses 
which  state  one  or  several  possible  ways  for  system  operation. 
These include routing algorithms, protocols, etc. The objectives
are to test (verify) the hypotheses, to complete the specification
of portions of the system, to compare alternative modes of system
operation, to identify system bottlenecks, and finally, to deduce
measures of system efficiency by obtaining estimates of the major
parameters.
II. OBJECTIVES OF SIMULATOR

We outline specific problems for which a simulation approach to the
solution is most appropriate.

A. Routing Algorithms

The object is to compare the efficiency of existing and
newly developed routing algorithms in terms of throughput and delay
on the one hand, and storage and processing requirements of
the algorithms at the switching nodes on the other hand. The
objective of this comparison is to suggest a small set (two or
three) of algorithms for implementation and further testing in an
experimental system. We note that it is not mandatory that a
single routing algorithm be used in a network. For example, in
the broadcast network that we discuss later, a simple routing
algorithm is used to load the switching nodes (repeaters) with a
more sophisticated algorithm. Furthermore, one may examine the
possibility of using an alternative routing algorithm under overload
conditions or in different hierarchies of the network, if a
hierarchical network is investigated.


B. Protocols

There may be many protocols in the communication network
considered, depending on the type of communicating devices and
on the type of application. The following protocols are examples:
terminal-terminal, terminal-switching node, terminal-host computer,
switching node-switching node, switching node-host computer,
host computer-host computer. Some of the above may contain
more than one protocol depending on the application, and others may be
needed in a hierarchical network. The objective of the simulation
is to test the efficiency of these protocols in a dynamic environment,
to change them when necessary, and to complete details which
may be missing.

C. Identify System Bottlenecks

One  of  the  advantages  of  the  simulator  is  that  it  enables 
one  to  observe  and  trace  the  detailed  flow  of  packets  in  the  network. 
For  example,  a specific  packet  may  be  traced,  such  as  an  information 
packet,  an  acknowledgement  packet,  or  various  priority  packets, 
from  the  origination  node  to  the  destination  node.  This  allows 
investigation  of  questions  such  as,  whether  the  bottleneck  is  at 
the  switching  nodes  or  due  to  the  limited  capacity  of  the  channel, 
given  a proposed  configuration,  and  communication  protocols.  It 
may  also  suggest  improvements  in  communication  protocols. 


D. Flow Control


The  simulator  is  the  only  tool  which  allows  testing  and 
improvement  of  theoretically  developed  flow  control  algorithms. 

E.  Software  Transfer 

A simulator  can  be  coded  so  that  subroutines  or  sections 
of  code  are  identified  with  specific  software  programs  of  the  switching 
nodes. This will reduce the effort of software development by
transferring or coding according to the programs in the simulator.
Furthermore,  it  will  allow  testing  of  sections  of  the  software 
of  communication  devices  by  comparing  these  with  the  corresponding 
sections  in  a simulated  device. 
F. Trade-Offs

Among the trade-offs which can be investigated are:
between the requirements of the switching nodes
and the capacity of links, given the topology and the
offered traffic rates; and between average delay, maximum
delay, and delay as a function of priority and link and switching
node capacities.

G. Stand-by Network


The simulator can be useful beyond the stage of development.
It can be used to study particular problems which may be
encountered in the network, once it becomes operational, and to
test suggestions for improvement.
III. GENERAL STRUCTURE OF SIMULATOR

It is important to separate data structure and management
functions on the one hand from the communication and device
functions on the other hand. There are several advantages in doing
so: firstly, one communication device does not have access to
information available in other devices; and secondly, it is easier to
identify and distinguish the communications part of the program
for the purpose of modification or software transfer. We have
developed efficient data structures which can be used for simulation
programs of communication systems. The proposed data structures
are described in Section V, and include the fourteen (14)
subroutines of Section VIII, subsection A.

It is useful to have one main subroutine for each type of
communication device; for example, one subroutine for all switching
nodes of the same type. The differences between the devices (e.g.
switching nodes) can be recorded in a state vector associated with
each device. The state vector will include information such as
the switching nodes to which this device has channels and the data
rates of these channels, the state of occupancy of the storage
buffers of the node, the routing algorithm that this node is
currently using, and others. In addition there will be buffers
associated with each device in which the content of specific
packets (e.g. packet type, priority) will be stored.

There are distinguishable functions which may be used by more
than one type of device (e.g. a modem); these functions can be
coded in separate subroutines.
A. Performance Measures


There should be an extensive measurement program to enable
the resolution of the problems outlined in the objectives. The
measurements are divided into network performance measurements,
performance of communication devices, and trace information.

B. Network Measurements


• Throughputs

We distinguish between throughput of packets and throughput
of information. Packets which successfully travel from origin
to destination contribute to the packet throughput measure.
Packets which are also acknowledged contribute to the information
throughput measure. The distinction comes from the fact
that when protocols are not efficient or when delays are very
large, the origination node may reissue another copy of a
previously transmitted packet. When the network delivers both
packets, then the two packets contribute to the packet throughput
measure, but in terms of information transferred only one packet
was delivered. Network throughputs are measured as a function
of time. This enables one to determine whether the network
can maintain a steady state throughput when offered a given
traffic rate, to estimate the time needed to obtain steady
state, and to observe the behavior of the network under either
minor or major perturbations.
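
The distinction between the two throughput measures can be made concrete with a small bookkeeping sketch. This is an illustration only: the `Delivery` record, its field names, and the rule of counting each acknowledged message identifier once are assumptions, not a description of the simulator's actual tables.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Delivery:
    message_id: int      # hypothetical identifier of the information carried
    time: float          # time at which the packet reached its destination
    acknowledged: bool   # whether the origin received an acknowledgement

def throughputs(deliveries, window):
    """Return (packet throughput, information throughput) per unit time."""
    packet_count = len(deliveries)
    # Duplicate copies of one message add to the packet count but carry
    # no new information, so each acknowledged message is counted once.
    distinct_acknowledged = {d.message_id for d in deliveries if d.acknowledged}
    return packet_count / window, len(distinct_acknowledged) / window

SAMPLE = [
    Delivery(message_id=1, time=1.0, acknowledged=True),
    Delivery(message_id=1, time=2.5, acknowledged=False),  # reissued copy
    Delivery(message_id=2, time=4.0, acknowledged=True),
]
```

Evaluating the sample over a window of 10 time units gives a packet throughput of 0.3 and an information throughput of 0.2. Repeating the calculation over successive windows would give the time series the text calls for.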


• The Average Number of Links That a Packet Traverses

This measure, when compared with the average number of
links of the input (offered) rate for a given topology,
reflects on the amount of alternate routing. It may also enable
one to detect looping in the network.
• Delays

Many delays should be measured in the network; we outline
some of these:

1. Delay to negotiate protocol between a
terminal and a switching node.

2. Delay to negotiate protocol and reserve
storage at destination switching node.

3. Delay of an information packet of a
given priority from origination to destination
node.

4. Delay for receiving an end-to-end acknowledgment
back at the origination node.

5. Delay for the delivery of a maximum size
message and receiving an acknowledgment.

6. Delays of special priority packets.

C. Performance of Communication Devices

• Utilization of Devices

Fraction of time that the device is transmitting or
receiving.

Number of packets stored in device as a function of time.

Number of packets successfully switched and number of
packets discarded due to buffer overflow or other reasons.

D. Trace Information

This information includes the listing of significant
communication events. The listing contains a unique identifier of
the packet, its type and priority, its origin and destination, and
the event, as needed to allow a detailed observation of specific
interactions, to trace specific packets, and to follow
interactions between source and destination.

Remark: Other measurements may be needed to evaluate the
efficiency of flow control algorithms.
IV. PACKET RADIO NETWORK SIMULATOR


The main features which distinguish the broadcast network from
a point to point (PTP) packet switching network such as the ARPANET
are: (i) devices in the network transmit packets by using a random
access scheme, and (ii) devices broadcast so that signals can be
received by several devices simultaneously.

When using a random access scheme, there is a possibility that
several packets are simultaneously received by a receiver due to
independent transmissions of several devices; in that event, none
of these packets is correctly received, and the corresponding devices
must retransmit their packets. This implies that, unlike in a PTP
network, the probability of error due to overlapping packets is
much higher than the probability of error due to other causes such
as Gaussian or impulse noise. Furthermore, the probability of error
varies widely depending on the amount of traffic on the channel.
The broadcast nature of the network implies that there is correlation
between the probabilities of successful transmission in different
parts of the network.

The communication system simulated contains three types of
devices: terminals, repeaters, and stations. The station is considered
as an interface communication device of the broadcast network
to a higher level network or to a computer installation. Terminals
are considered as traffic sources and traffic sinks; they transmit
packets to the station and receive response packets from the station;
they may be mobile. The basic function of the repeater is to extend
the effective range of the terminals and the station. The network
simulated contains repeaters and stations which are placed at fixed
locations chosen at random. Terminals originate at random times and
are placed in random locations on the plane. A terminal is considered
to depart from the system once it completes its communication. A
more detailed description of the system simulated can be found in
[4] and [5].

The program was coded in FORTRAN. It includes a total of 33
subroutines, approximately 3,000 statements. The compilation time
on a CDC 6600 is approximately 20 CP sec., and the running time
for meaningful results is 100-300 CP sec. The storage requirement
of the program is 245,000 (octal) 60-bit words.

The simulator has already been used for improving communication
protocols, and to answer questions related to the trade-offs
between device range and device interference, and the trade-offs between
a single and a dual data rate system.
V. DATA STRUCTURES OF THE SIMULATOR


The  global  information  for  the  Packet  Radio  Simulation  Program 
is  contained  in  five  (5)  data  structures: 

1)  Event  Structure; 

2)  Active  Message  Structure; 

3)  Active  Packet  Structure; 

4)  Repeater-Station  Structure; 

5)  Data  Collection  Tables. 

The  meaning,  configuration,  and  elements  of  these  structures  will 
be  explained  in  the  next  five  sub-sections.  In  the  last  sub-section, 
the  use  of  these  structures  in  the  context  of  the  entire  program 
will  be  indicated. 

A.  Event  Structure 


The simulation program is event-driven. That is, periodically
the Event Structure is consulted to determine the time
of occurrence of the next event. The Event Structure also contains
information telling the program what the event is. Examples of
events are arrivals of messages to terminals, transmission of
packets, arrival of packets at receivers, and arrival of messages
to the stations. As events are executed, they are deleted from
the structure; and, periodically, newly-generated events are
added to the structure. Since there are a large number of events
(many more than the number of exogenous message arrivals, for example),
the process of efficiently determining the next event is of vital
importance. To this end the event times are maintained in a "heap."
Corresponding to each event "i" is its time "t_i," the device
subroutine "d_i" to which it refers (e.g. station, terminal,
repeater), the index "w_i" of the device in question (i.e.
which repeater, which station, etc.), a number
representing the point at which the routine is entered, and,
finally, the packet number of the packet in question. In addition,
there is the heap index vector itself: "h_j" points to the event
which occupies the j-th position in the heap. A heap is a structure
in which $t_{h_j} \le \min\{t_{h_{2j}}, t_{h_{2j+1}}\}$. At all times $t_{h_1}$ is the smallest $t_i$
over all the events. This structure allows for quite rapid selection
of the minimum $t_i$ while using a minimal amount of storage
[2], [1], [3]. In order to efficiently eliminate old events and
to reuse the space created by the elimination, a garbage stack of
unused locations is maintained.
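
The combination of an indirect heap and a garbage stack can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the FORTRAN code of the simulator: the class and method names are hypothetical, positions are 0-based (so the children of position j are 2j+1 and 2j+2 rather than 2j and 2j+1), and the per-event fields other than the time are lumped into a single `data` value.

```python
class EventStructure:
    """Event slots addressed through a heap index vector, with a garbage stack."""

    def __init__(self):
        self.time = []     # time[i]: time t_i of the event stored in slot i
        self.data = []     # data[i]: remaining fields of event i
        self.heap = []     # heap[j]: slot of the event at heap position j
        self.garbage = []  # stack of freed slots available for reuse

    def add(self, time, data):
        """Store a new event, reusing a freed slot when one is available."""
        if self.garbage:
            slot = self.garbage.pop()
            self.time[slot], self.data[slot] = time, data
        else:
            slot = len(self.time)
            self.time.append(time)
            self.data.append(data)
        self.heap.append(slot)
        self._sift_up(len(self.heap) - 1)
        return slot

    def pop_next(self):
        """Remove the earliest event and return (time, data)."""
        top = self.heap[0]
        last = self.heap.pop()
        if self.heap:
            self.heap[0] = last
            self._sift_down(0)
        self.garbage.append(top)
        return self.time[top], self.data[top]

    def _sift_up(self, j):
        while j > 0:
            parent = (j - 1) // 2
            if self.time[self.heap[parent]] <= self.time[self.heap[j]]:
                break
            self.heap[parent], self.heap[j] = self.heap[j], self.heap[parent]
            j = parent

    def _sift_down(self, j):
        size = len(self.heap)
        while True:
            child = 2 * j + 1
            if child >= size:
                break
            right = child + 1
            if right < size and self.time[self.heap[right]] < self.time[self.heap[child]]:
                child = right
            if self.time[self.heap[j]] <= self.time[self.heap[child]]:
                break
            self.heap[j], self.heap[child] = self.heap[child], self.heap[j]
            j = child
```

Only the index vector is rearranged when events are added or removed; the event records themselves stay in their slots, which is what lets other structures refer to an event by a fixed index.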

B.  Active  Message  Structure 

The external or exogenous traffic which flows into the
Packet Radio Network consists of messages which arrive at terminals
or stations and represent information that users wish to send to a
location using the net. Messages are to be distinguished from
packets, which carry the message information internally in the net.
In general, there may be several packets in the network carrying
copies of the same message or of parts of that message. Messages
are added to the Active Message Structure when the event corresponding
to their generation occurs. A message stays on the list until the last
packet containing the message is dealt with. Associated with each
message "i" is its arrival time "t_i"; the current number of packets
representing the message "n_i" (when "n_i" is reduced to zero (0), the
message is removed from the structure); its length "l_i"; "x_i" and
"y_i", the co-ordinates of the terminal or station originating the
message; and other pieces of data, such as the repeater with which
the terminal communicates, the total number of packets, and arrival
time of message, which may be needed for output statistics. In
order to eliminate old messages and add new ones efficiently, the
messages are kept in a doubly-linked list structure. There is a
list kept for messages and one for unused message spaces. This
information is kept in two vectors, "f" and "b". Thus, "f_i"
represents whatever follows the space in question ("i"), be it message
or empty space; while "b_i" represents whatever precedes the space.
The unit being followed or preceded may be an empty space or may
contain a message.
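
A doubly-linked list held in two index vectors can be sketched as below. The vectors `f` and `b` follow the text; everything else (the class name, the `NIL` sentinel, keeping one head pointer per list, and inserting at the head) is an assumption made for the illustration.

```python
NIL = -1  # sentinel meaning "no space follows / precedes"

class MessageSpaces:
    """Fixed pool of message spaces threaded onto an active and a free list."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.f = list(range(1, capacity)) + [NIL]   # f[i]: space following i
        self.b = [NIL] + list(range(capacity - 1))  # b[i]: space preceding i
        self.head = {"free": 0, "active": NIL}      # initially all spaces unused

    def _unlink(self, i, which):
        prev, nxt = self.b[i], self.f[i]
        if prev == NIL:
            self.head[which] = nxt
        else:
            self.f[prev] = nxt
        if nxt != NIL:
            self.b[nxt] = prev

    def _push(self, i, which):
        old = self.head[which]
        self.f[i], self.b[i] = old, NIL
        if old != NIL:
            self.b[old] = i
        self.head[which] = i

    def add_message(self):
        """Move one unused space to the active list and return its index."""
        i = self.head["free"]
        if i == NIL:
            raise MemoryError("no unused message space")
        self._unlink(i, "free")
        self._push(i, "active")
        return i

    def remove_message(self, i):
        """Return the space of message i to the unused list."""
        self._unlink(i, "active")
        self._push(i, "free")

    def active(self):
        """Indices of the active messages, following the f links."""
        out, i = [], self.head["active"]
        while i != NIL:
            out.append(i)
            i = self.f[i]
        return out
```

Both insertion and removal touch only a constant number of entries of `f` and `b`, which is the efficiency the doubly-linked arrangement is meant to provide.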
C. Active Packet Structure

The active packets are kept in a list. Associated with
the i-th packet is a pointer to its corresponding message and several
words which state the variable part of the packet label for use by
the routing algorithms. A garbage stack similar to the one used for
events keeps track of vacant spaces in the Packet Structure.

D. Repeater-Station Structure


The Repeater-Station Structure is basically a list of the
repeaters and stations, their locations, their possible neighbors
(repeaters and stations within range of their transmitters and
receivers), and their state. Associated with the i-th station or
repeater are "x_i" and "y_i", the co-ordinates of the device; and a
state vector which, among others, specifies items such as whether
the repeater or station is busy, free, or turned off, and
whether the device is a station or a repeater. In addition, there
is a list of repeaters and stations adjacent to the i-th repeater or
station.


E. Data Collection Tables


Sufficient statistics for the evaluation of the system
performance measures (see Section VI subsection D) are kept in the
data collection tables.
F. An Outline of the Use of Data Structures


The global simulation structure is based on events of
packets arriving at devices: stations, terminals, and repeaters.
Figure 1 schematically illustrates the program flow. It is quite
simplified, one simplification being especially important and needing
emphasis. The events of retransmissions and acknowledgements
resulting from a packet's arriving at a repeater, station, or terminal
are not necessarily generated immediately, but may depend on the
arrival of packets at the device subsequent to the packet which
gives rise to the new events.
VI. THE COMMUNICATION SYSTEM SIMULATED

A. General System Description

• Channel and Access Mode

The communication channel is shared for transmission in
both directions, to the station or from the station to terminals.
The channel access mode is the Non-Persistent Carrier Sense
Multiple Access (CSMA) [7]. That is, when a packet is ready
for transmission, the device senses the channel and transmits
the entire packet if the channel is idle. If the channel is
busy, the device reschedules the packet for some future random
time at which it senses the channel again, and the procedure is
repeated.
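
One sensing step of the non-persistent rule can be written as a small function. This is a sketch under assumptions that the report does not state: the channel state is summarized by the time until which it is busy, and the rescheduling delay is drawn uniformly from (0, max_backoff]. The function and parameter names are hypothetical.

```python
import random

def csma_attempt(now, channel_busy_until, packet_duration, max_backoff, rng=random):
    """One non-persistent CSMA sensing step for a packet ready at time `now`.

    Returns ("transmit", end_time) if the channel is sensed idle, otherwise
    ("reschedule", next_sense_time) with a random future sensing time.
    """
    if now >= channel_busy_until:
        # Channel idle: the entire packet is transmitted.
        return "transmit", now + packet_duration
    # Channel busy: sense again after a strictly positive random delay
    # (uniform delay is an assumption of this sketch).
    delay = max_backoff * (1.0 - rng.random())
    return "reschedule", now + delay
```

In an event-driven simulator the "reschedule" result would become a new sensing event for the same packet, so the procedure repeats until the channel is found idle.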

• Capture

A zero capture system is simulated. That is, whenever
the reception of more than one packet overlap in time, none
of the packets is correctly received.
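
The zero capture rule amounts to an interval-overlap test at each receiver. The following sketch assumes each reception is described by a start time, an end time, and a packet identifier; the function name and this representation are illustrative, and receptions that merely touch at an endpoint are treated as not overlapping.

```python
def correctly_received(receptions):
    """Return ids of packets whose reception overlaps no other reception.

    `receptions` is a list of (start, end, packet_id) tuples observed at one
    receiver. Under zero capture, any overlap destroys all packets involved.
    """
    ordered = sorted(receptions)
    received = []
    for idx, (start, end, packet_id) in enumerate(ordered):
        overlaps = any(
            other_start < end and start < other_end
            for k, (other_start, other_end, _) in enumerate(ordered)
            if k != idx
        )
        if not overlaps:
            received.append(packet_id)
    return received
```

The pairwise comparison is quadratic in the number of receptions, which is adequate for a sketch; a simulator would more likely detect the overlap incrementally as arrival events are processed.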

• Packet Types Simulated

IP - Information Packet

ETE - End-to-End Acknowledgement; Short Packet (Assumed
to be 10% of the length of an information packet).

SP - Search Packet, transmitted by Terminal
or Repeater to all devices and aimed to
identify a specific receiver (short packet).




RSP  - Response  to  a Search  Packet,  transmitted  by 
Repeater  or  Station  which  received  a SP  and 
is  available  for  handling  packets.  This 
packet  contains  the  label  of  the  transmitting 
device  and  is  addressed  to  all  devices. 

B.  Routing  Algorithms 

Three routing algorithms which are implemented in the
simulator are briefly described in the following paragraphs. A
detailed description of the routing schemes is given in [5].

• Hierarchical  Labeling: 

The  hierarchical  labeling  routing  scheme  enables  point- 
to-point  routing  between  devices  along  an  "efficient  path". 

It  is  obtained  by  assigning  to  every  repeater  a label,  which 
forms,  functionally,  a hierarchical  structure.  The  label 
assigned  contains  the  following  information: 

1)  A specific  address  of  the  repeater  for 
routing  purposes. 

2) The minimum number of hops to the nearest
station.

3)  The  specific  address  of  all  repeaters  on  the 
shortest  path  to  the  station,  and  the  address  of 
the  repeater  to  which  a packet  has  to  be  transmitted 
when  destined  to  the  station. 

In  the  hierarchical  labeling  algorithm  an  information  packet  (IP) 
is  addressed  to  one  device.  If  it  is  received  by  a device  to 
which  it  is  not  addressed,  then  the  receiving  device  is  closer 
to  the  destination  than  the  device  to  which  the  packet  was 
addressed.  If  the  preassigned  path  is  temporarily  blocked, 
the packet may depart from it. It then uses the most efficient
path  from  its  new  location.  The  departure  from  the  preassigned 
path  is  obtained  by  a search  procedure. 

The Response to Search Packet (RSP) is transmitted by
repeaters after a random waiting time. This time randomization
is essential in a no-capture system. Otherwise, if
more than one receiver wishes to respond to the SP, the
transmissions will overlap at the searching device. The station,
on the other hand, transmits the RSP immediately after
receiving the SP. (Note that if we assume zero processing time,
then the channel is idle at this time, since otherwise the
SP would not have been correctly received.) This enables
searching devices within range of an idle station to communicate
directly with the station.

A repeater makes one attempt to transmit a RSP, and
if the channel is busy it discards the RSP rather than storing
it for future transmission. This allows control of
the level of terminal blocking by specifying the number
of transmissions of the SP by a terminal. Thus, when the
system is congested in the geographical neighborhood of
the terminal, the terminal will not be able to "enter" the system.
This feature also makes repeaters more available for handling
information packets.

* Directed  Routing  (One  Level  Labels) 

Directed routing is a simplified version of the hierarchical
labeling algorithm in which the only information preserved
is the direction TO or FROM the station. Repeaters are assigned
labels that indicate the hierarchy level, or the number
of hops to the nearest station. When a device transmits
a packet, the packet is addressed to all repeaters (stations)
that are closer to the destination than the transmitting device.
Many devices can receive the same packet, and it may arrive at
more than one station. The acknowledgement schemes, the
station-station protocol, and the station-terminal protocol
must then resolve this problem.
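
The acceptance rule of directed routing can be sketched as below. This is an interpretation, not the simulator's code: it assumes that the hierarchy level equals the hop count to the nearest station (a station being level 0), that "closer to the destination" means a lower level when travelling TO the station and a higher level when travelling FROM it, and the function name is hypothetical.

```python
def accepts(my_level, sender_level, direction):
    """Decide whether a device treats a directed packet as addressed to it.

    my_level, sender_level: hierarchy levels (hops to the nearest station).
    direction: "TO" (toward the station) or "FROM" (away from the station).
    """
    if direction == "TO":
        return my_level < sender_level
    if direction == "FROM":
        return my_level > sender_level
    raise ValueError("direction must be 'TO' or 'FROM'")
```

Because several devices can satisfy the rule for the same transmission, duplicates are possible, which is the problem left to the acknowledgement schemes and protocols named above.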

• Flooding  Algorithm  (Plus  Repeater  Memory) 

In this algorithm, there is no directionality of transmission.
A packet is addressed to all devices that can hear
it. To control the problems of cycling and looping, repeaters
are assigned storage for unique identifiers of packets that
they recently repeated. When a packet is received by a repeater,
the repeater compares its identifier with those stored and
discards the packet if a match occurs. A maximum handover
number (MHN) in the packet prevents it from being propagated
over very long distances. This feature is also used in the
other routing algorithms.
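
The repeater memory and the MHN check can be sketched as follows. The class name, the fixed-size memory that forgets the oldest identifier first, and the rule that a packet with no remaining handovers is dropped are assumptions of this sketch; the report states only that recent identifiers are stored and that the MHN bounds propagation.

```python
from collections import deque

class FloodingRepeater:
    """Repeater with a bounded memory of recently repeated packet ids."""

    def __init__(self, memory_size):
        # deque(maxlen=...) silently drops the oldest id when full.
        self.recent = deque(maxlen=memory_size)

    def handle(self, packet_id, mhn):
        """Return the MHN to repeat the packet with, or None to discard."""
        if packet_id in self.recent:
            return None  # already repeated recently: suppress the loop
        if mhn <= 0:
            return None  # handover budget exhausted (assumed rule)
        self.recent.append(packet_id)
        return mhn - 1
```

A small memory keeps the repeater simple, at the cost that a packet may be repeated again once its identifier has been forgotten.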

C.  Acknowledgement  Schemes 

The  acknowledgement  scheme  has  particular  significance 
in  this  system  because  of  the  broadcast  feature  and  the  limited 
capability  of  the  repeater  for  processing  and  storage.  The  following 
acknowledgements  are  used: 

• End-to-end acknowledgement (ETE Ack) between
station and terminal to ensure message integrity.
The frequency and precise meaning of the Ack depend
on the particular protocol used, and are part of the protocol.

• A hop-by-hop passive echo acknowledgement
(HBH Echo Ack) along the path. When device i transmits
a packet, it waits a sufficient time to allow
devices that receive the packet to repeat it. When
any of these repeats the packet and the repeat is
received by device i, device i considers it an Ack.

In a point-to-point network, such as the ARPANET, the channels
are fixed, so that when node j receives a packet on channel k it
knows the device, say i, which has transmitted the packet to it,
and thus can transmit a specific HBH Ack to device i. In the
Packet Radio System, however, this information is not available.
Therefore, the HBH Ack must be independent of the path the
packet travels on. The HBH Echo Ack test used included the
following:

1) Identification of the packet.

2) A test that the MHN of the Echo received is
smaller than the MHN of the packet stored. This
indicates that the packet has advanced along the
path, rather than being a retransmission from
devices that had the packet previously.
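
The two-part test above can be written as a single predicate. The function and argument names are illustrative only; the comparison itself follows the two conditions listed.

```python
def is_echo_ack(stored_id, stored_mhn, heard_id, heard_mhn):
    """Return True if an overheard packet acknowledges a stored packet.

    The overheard packet must carry the same identifier and a strictly
    smaller MHN, which shows that it has moved further along the path.
    """
    return heard_id == stored_id and heard_mhn < stored_mhn
```

An echo sent with MHN = 0 passes this test for every device that still stores the packet with a positive MHN, so one transmission can acknowledge several devices at once.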

The HBH Echo Ack has several advantages over a specific HBH Ack:

1) It simplifies the repeater (hardware and software),
so that it need not construct and manage
acknowledgement packets.

2) It saves the traffic overhead of transmitting
specific acknowledgements. This is most significant
in a broadcast network.

3) It enables acknowledgement of several devices at
a time; in particular, all devices which store the
packet with a MHN larger than that received are
acknowledged.

4) It enables shortening the transmission path, as
described below.


Since  the  RSP's  by  repeaters  are  randomized  in  time,  a terminal 
frequently  does  not  identify  the  repeater  nearest  to  the  station 
within  range  of  the  terminal.  In  fact,  if  two  repeaters  are  labelled 
on  the  same  path  to  the  station  and  both  are  within  range  of  the 
terminal,  there  is  a higher  probability  that  the  terminal  will  identify 
the  repeater  farther  away  from  the  station  since,  on  the  average,  the 
latter  handles  less  traffic.  Suppose  a single  data  rate  channel 
is  used  throughout  the  transmission  path  between  terminal  and 
station,  the  terminal  identifies  a packet  transmitted  to  it  by 
its  specific  terminal  ID,  and  the  station  can  recognize  any  packet 
destined  for  it.  Then  a communication  path  as  shown  in  Figure  2 
may be established. In Figure 2, the terminal is within effective
range of R4, R5, and R6, and the station is within effective range
of R1 and R2. The terminal shown identified R6 as the repeater
to which it transmits. The path from terminal to station will
usually be T -> R6 -> R5 -> R4 -> R3 -> R2 -> S, and from the station
to the terminal S -> R2 -> R3 -> R4 -> T. The end devices, terminal
and station, transmit the Echo Ack immediately after receiving the
packet, and transmit it with MHN=0; thus they acknowledge all devices
which still store the packet. In particular, when R4 transmits a
packet towards the terminal, it is addressed to R5; however, the
terminal may receive this packet and acknowledge both R4 and R5
simultaneously. Similarly, when R2 transmits towards the station it
addresses the packets to R1; when the packet is received by the
station, the station acknowledges both R1 and R2 simultaneously.
D. Performance Measures


• Throughput 

Considering the set of stations and the set of terminals
as the end devices, the system throughput (in packets) is
defined as the rate of information packets (IP's) that originated
at stations and arrived at terminals, plus the rate of IP's
that originated at terminals and arrived at stations.
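
As a small worked illustration of this definition, the throughput over an observation interval is the sum of the two delivery counts divided by the elapsed time. The function name and the choice of time unit are assumptions of this sketch.

```python
def throughput(ips_to_terminals, ips_to_stations, elapsed):
    """System throughput in information packets per unit of time.

    ips_to_terminals: IP's that originated at stations and arrived at terminals.
    ips_to_stations:  IP's that originated at terminals and arrived at stations.
    elapsed:          length of the observation interval (must be positive).
    """
    if elapsed <= 0:
        raise ValueError("elapsed must be positive")
    return (ips_to_terminals + ips_to_stations) / elapsed
```

Acknowledgement and search packets do not enter the count, since only IP's delivered between end devices are included.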


• Delays

The following delays are measured:

1) Terminal delay to identify specific repeater.

2) Terminal delay to establish communication with
station and to negotiate protocols.

3) End-to-end delay for an IP.

4) Terminal interaction delay as a function of the
number of IP's transmitted and received. The interaction
delay is defined as the time elapsed from terminal
origination to departure.


• Blocking  and  Loss 

When a terminal does not successfully identify a repeater
(or station) after transmitting an SP for the maximum number
of times specified, it is considered blocked. In addition,
under certain conditions, terminals will depart from the
system without completing communication. This will contribute
to additional system loss. The blocking and loss are measured
separately since the former indicates the difficulty in
entering the system, whereas the latter reflects on the
inefficiency of the routing.

• Device Performance

1) Probability that the station is busy. The station
is sampled during the simulation and is assumed busy
if it is actively receiving or transmitting; otherwise,
it is assumed idle. This measure is an indication of
the channel traffic at the station.

2) Successful completions by repeaters. The number
of packets that each repeater has successfully switched
is counted. This indicates the distribution of load
in the network and reflects on the power duty cycle
of repeaters.

• Other Measures


1)  The  number  of  terminals  in  the  system,  the  total 
number  of  packets  stored  in  the  system,  the  number  of 
events  to  be  processed;  all  as  a function  of  time. 

2)  The  complete  output  includes  a detailed  description 
of  the  flow  of  significant  communication  events. 
TYPICAL COMMUNICATIONS PATH

FIGURE 2: The solid lines indicate the labelled path
between the repeaters and station. The dashed lines indicate
the effective connectivity of terminal and station
to repeaters. The arrows indicate the practical transmission
path for the particular terminal, to and from the
station.
VII. LOGICAL OPERATION OF DEVICES


A.  States  of  Devices 


Each device is characterized by a state vector. Some of
the state variables will be needed in the physical devices, for
example, the label, a parameter indicating the maximum number of
transmissions, the maximum handover number to be assigned by repeater and
station, the state of occupancy of its storage, etc. Other variables
are particular to the simulation. The following are examples:

Operational  State  of  Device 

PR - Passive Receive State: The device is in receive state
and does not sense any carrier.

AR - Active Receive State.

AT - Active Transmit State.

ART - Active Receive and Transmit.

When  a device  is  in  state  AR  or  ART,  it  can  be  receiving 
several  overlapping  packets  simultaneously.  In  the  program,  we  use 
a common  channel  configuration  (half  duplex).  Thus,  since  carrier 
sense  is  used  for  channel  access,  the  device  can  change  to  AT  only 
from  PR  and  to  ART  only  from  AT. 
* Number of Overlapping Packets

This number is incremented by one whenever the device is
in state AR or ART and a new packet begins to arrive, and
decremented when a packet transmission ends. The number of
overlapping packets indicates the number of times an end of
packet transmission has to occur before the device changes its
state to PR.
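
The operational states and the overlap counter can be sketched as a small state machine. This is a simplified reading, not the simulator's bookkeeping: it assumes that the counter tracks incoming packets only, that the device's own transmission is tracked by a separate flag, and that the state is derived from the two. The class and method names are hypothetical.

```python
class Device:
    """Operational state of one device on a half duplex common channel."""

    def __init__(self):
        self.transmitting = False
        self.overlapping = 0  # incoming packets currently on the air

    @property
    def state(self):
        if self.transmitting:
            return "ART" if self.overlapping else "AT"
        return "AR" if self.overlapping else "PR"

    def begin_transmit(self):
        # Carrier sense: a transmission may start only from PR.
        if self.state != "PR":
            raise RuntimeError("channel not idle at this device")
        self.transmitting = True

    def end_transmit(self):
        self.transmitting = False

    def begin_receive(self):
        self.overlapping += 1

    def end_receive(self):
        if self.overlapping == 0:
            raise RuntimeError("no packet is being received")
        self.overlapping -= 1
```

With these rules the device can reach AT only from PR, and ART only from AT, which matches the transitions stated for the half duplex configuration.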

* End  of  Busy  Period 

This time is recorded for the purpose of saving CPU time.
The transmission time of a packet is set to the End of Busy
Period plus a random time; otherwise the device may be called
to transmit a packet several times during its busy period.


B.  Terminal 


When a terminal originates a message, it begins to transmit
and retransmit a SP to identify a specific receiver. If it does not
identify a specific receiver after a specified number of transmissions,
it departs from the system. We say that such a terminal is blocked.
When a terminal identifies a specific receiver, it substitutes the
label and MHN sent by the receiver into its IP and begins to transmit
its IP. The IP is retransmitted after short waiting periods
until a HBH Echo Ack is received. At that time, the terminal stores
the IP for a longer waiting period, after which the IP is reactivated
if an ETE Ack is not received.* The terminal is expecting several IP's
from the station, which are ETE acknowledged by the terminal. When
all IP's from the station are received and ETE acknowledged, the
terminal departs from the system.

* We use the term retransmission when a device waits a relatively short
period of time (less than 2 IP slots) and is awaiting a HBH acknowledgement.
We say that a packet is reactivated when an end device stores
the packet, awaiting an ETE Ack. When a packet is reactivated, it goes
through the whole process of retransmissions.

C. Repeater

A repeater does not distinguish between IP's and ETE Acks,
except for their transmission time. When an IP (or ETE Ack) addressed
to a repeater is received by it and the repeater has available
storage, it stores the packet, decrements the MHN, modifies the packet
label according to the routing, and begins to transmit and retransmit
the packet, awaiting the Echo Ack. When an Echo Ack is received, the
repeater discards the packet. When a repeater is not successful in
transmitting along the "shortest" path, it begins to search for an
alternate receiver by transmitting SP's. When one is found, it transmits
the entire packet to it; otherwise, it discards the packet. When
a repeater receives an SP, it checks whether it has available storage;
if it does, it makes one attempt to transmit a RSP and then discards
it. When a repeater receives a RSP, it tests whether it needs one; if
it does, it uses the label, otherwise it discards it. The repeater
currently simulated has buffer storage for two packets: one exclusively
used for packets directed towards the station, and the second
for packets toward the terminal. In addition, the repeater can inspect
all packets that it receives, which are stored in common arrays
in the simulation program. Thus, from a practical viewpoint, buffer
storage for three packets per repeater is provided in the simulation
program.
D. Station

The storage organization in the simulated station is shown
in Figure 3.* There are two queues for active packets. Packets in
these queues are active in the sense that they are retransmitted after
short random periods of waiting until an Echo Ack is received. The
active queue for long packets contains IP's from the station to
terminals. Once an IP is Echo acknowledged, it is stored in the passive
queue for a longer period, after which it is reactivated if an ETE
Ack from the terminal is not received. The active queue for short
packets contains ETE Acknowledgement packets to terminals, and these
have priority over the long active packets. The ETE Ack packets are,
obviously, discarded once an Echo Ack is received. The point-to-point
(PTP) network queue simulates the interaction of the packet radio
network with a PTP network. When a new IP is received from a terminal,
it is stored in the PTP network queue for a random time, after which
a response message containing several response IP's to that terminal
is generated and placed into the active queue for long packets. The
station responds immediately to SP's, and ignores RSP's.

E. Flow Diagrams of Devices

Figures 4, 5, and 6 show the flow diagrams of the devices
used in the simulator. These diagrams are simplified to the extent
that they show "what to do" but are not sufficiently detailed to show
"how to do it." The latter depends on the particular system simulated,
i.e., the routing, the channel configuration, etc.

A device is called from the subroutine EVENT; the calling
sequence includes, among others, an interrupt number which indicates
to the device the task that it has to perform. The only event which
is external to a device is that with an interrupt = 1. All other
calls to a device are due to events generated by the device itself.
Thus, the number of exogenous events is very small compared to the
number of events generated by the devices; in particular, when the
offered data rate to the system is high.

* Figure 3 shows only the storage of IP's and ETE Acks for transmission
to terminals in the packet radio network.

It is clear that when the offered traffic rate to the system is
high, there will be many collisions of packets. Thus, to save computing
time, each device was coded so that it does not examine the
content of the packet, or whether the packet is addressed to it, at the
beginning of packet reception (see the flow diagrams). This is done at
the end of packet reception, providing there was no interference.

There are parts of the subroutines of Repeater, Station, and
Terminal which are identical and thus coded into common subroutines.
These relate to the identification of the header content, and the channel
access mode. Some of these can be associated with the modem of the
physical devices. The parts which differ involve mainly the processing
after a packet has been identified: queuing at the
station, ETE Ack by end devices, etc.
FIGURE 4: TERMINAL DEVICE FLOW DIAGRAM

FIGURE 5: STATION DEVICE FLOW DIAGRAM

FIGURE 6: REPEATER DEVICE FLOW DIAGRAM
VIII. SUBROUTINES OF THE SIMULATOR

A. Data Structure and Management Subroutines

EVENT    Takes the next event out of the Event Data Structure for
         execution.

INHEAP   Adds an event to the heap in the Event Data Structure.

INMESS   Allows the introduction of special messages such as
         acknowledgements, control messages and the like into the
         Message Data Structure.

INPUT    Reads the input parameters and determines the placing of
         repeaters and stations.

MESSDEL  Is called by device routines to release a message as soon
         as all packets representing the message are deleted.

NEWMESS  Generates next exogenous message and adds to the Message
         Data Structure.

NEWPACK  Adds a new packet to the Packet Data Structure.

NXTEVNT  Adds a new event to the Event Data Structure.

OUT      Prints out intermediate data for debugging.

OUTHEAP  Takes the index of the next event time from the heap.

PRSIM    The driver routine (main program).
MEASURE  Collects data on system performance.

MCOUNT   Counts the number of packets associated with each terminal
         which are stored in the system.

B. Communication and Device Subroutines

REPEAT   Main subroutine of repeater.

STATION  Main subroutine of station.

TERMINAL Main subroutine of terminal.

DEVINIT  Reads parameters which define the particular communication
         system, labels, and flow control parameters. Initializes
         states of devices.

BGNRCV   Maintains states of devices related to the RF channel (e.g.,
         number of overlapping packets).

ENDRCV   Same as above at the end of packet reception.

ECHO     Records that a device is receiving an echo acknowledgement.

SRER     Called when a non-overlapping packet is received for testing
         packet type and label.

ALTROUT  Called after repeater receives an RSP, checks whether
         repeater needs one, and has not used one before.

SRTTRT   Transmits packet and generates an event for the end of
         packet transmission.
REPNEXT  Determines which packet of a repeater is to be transmitted
         next.

TERSTOR  Stores correct IP's received by terminal and generates
         event for transmission of an ETE Ack.

SNREEKO  Called by station after receiving an Echo Ack. Identifies
         and maintains the queue in which the acknowledged packet is
         stored. If it is an IP, then it transfers the packet to
         another queue where it waits for an ETE Ack or for
         reactivation.

SHIFTQ   Shifts packets in the various queues of the station.

SNREPAK  Called by station after correctly receiving a packet. If
         the packet is an ETE Ack, then subroutine drops the packets
         acknowledged, maintains proper queues and the message counts.
         If it is an IP, subroutine verifies that same packet has not
         been received before, and if so, it generates packet and an
         event for transmission of an ETE Ack, and also generates a
         random time and an event for the arrival of the response
         message from the PTP network.

RESPONS  Called by station, sets all response packets to a terminal
         into the active queue and generates events for transmitting
         them.

SEKOTRT  Used by station to transmit an Echo Ack for the last hop.

CONNECT  Determines the most efficient repeater to which station
         should address packet when transmitting to a terminal.

TRANSMT  Called by a device which transmits a packet. Puts packet
         in list structure; determines all the devices that should
         receive the packet, the exact time for beginning to receive
         it; and generates the events to devices.
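
The event heap that INHEAP adds to and OUTHEAP takes from can be illustrated with a small priority queue. This is a sketch in a modern language, not a transcription of the simulator's routines: the class, the method signatures, and the tie-breaking rule for events with equal times are assumptions.

```python
import heapq
import itertools

class EventHeap:
    """Time-ordered store of pending simulation events."""

    def __init__(self):
        self._heap = []
        # Sequence numbers keep events with equal times in insertion order.
        self._seq = itertools.count()

    def inheap(self, time, device, interrupt):
        """Add an event: call `device` with `interrupt` at `time`."""
        heapq.heappush(self._heap, (time, next(self._seq), device, interrupt))

    def outheap(self):
        """Remove and return the earliest event as (time, device, interrupt)."""
        time, _, device, interrupt = heapq.heappop(self._heap)
        return time, device, interrupt

    def __len__(self):
        return len(self._heap)
```

A driver loop in the spirit of PRSIM would repeatedly take the earliest event and call the named device with the interrupt number, and the device would add whatever follow-up events it generates.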


C. Summary of Acronyms

Ack   - Acknowledgement

AR    - Active Receive

ART   - Active Receive and Transmit

AT    - Active Transmit

ETE   - End-to-end

HBH   - Hop-by-hop

IP    - Information packet

Label - An address assigned to a device for routing purposes

MHN   - Maximum handover number

MNT   - Maximum number of transmissions

PR    - Passive Receive

PTP   - Point-to-point

RSP   - Response to search packet

SP    - Search packet
IX. OBSERVATION OF TRAFFIC FLOW IN THE PACKET RADIO NETWORK

The first system simulated was a Common Channel Single Data Rate
system, in which the station is routing traffic as a repeater (Naive
Station). We denote the system as CCSDR (NS). The system defined
has a single data signalling rate for communication between terminal
and repeater (or station) and in the repeater-station network; the
channel is used in a half duplex mode. When the station is routing
traffic as a repeater, it cannot receive packets not specifically
addressed to it.

In all experiments reported here, the labels of repeaters and
station were preassigned. The hierarchical (directed) labelling
scheme of the system in this experiment is shown in Figure 7.
Figure 8 shows the connectivity of the repeaters and station. That is,
when a device transmits, all the devices connected to it by a line are
within an effective range and "hear" the transmission.

The  objective  of  the  first  series  of  experiments  was  to  observe 
the  detailed  operation  of  devices  and  the  efficiency  of  the  system. 
The  following  observations  were  made: 

1. The "critical hop" in the system is that between the
first level repeaters and the station. This was concluded
by observing the frequency at which repeaters begin to search
and at which they discard packets, and from the observation
that there is no significant difference in the delay
when the number of hops from the station that a packet travels
is increased.

2. There is a higher probability of end-to-end successful
completion when routing from the station to a terminal than
when routing from a terminal to the station. Practically,
there is almost "no" difference in time delay between the
delay of an information packet from the terminal arriving at
the station and the time that the terminal receives an ETE
Ack from the station.

3. Many packets associated with terminals that have departed
from the system are routed in the network.

The effect of improving the routing capabilities of the station
can be readily observed. In particular, one can see in Figures 7 and
8 that while the connectivity of the station is 7, there are only 4
repeaters labelled from the station. Consequently, the station is busy
part of the time with non-useful traffic. This situation can be improved by
changing the routing of the station so that: (1) it receives any
packet that it can hear and which is (eventually) addressed to it,
and (2) it transmits response packets to the repeater nearest to the
terminal along the routing path that it can reach. This change was
implemented for all system studies subsequent to the initial experiments.
Apart from the change implemented, the observation suggests
that particular attention should be given to the design of the repeater
network in the neighborhood of the stations. It is also noted
that these repeaters have a higher power duty cycle since they handle
all packets collected from other parts of the network. The routing
change made at the station enables the allocation of many more repeaters
in the neighborhood of the station than are functionally
needed, without resulting in an increase in the artificial traffic
generated. The exact labelling of these repeaters is also not critical.
One of the reasons leading to observation 2 is that the station has
a higher probability than the first level repeaters of successful
transmission over the critical hop, because it is the largest user
and does not interfere with its own transmissions. Theoretically,
one may expect a similar conclusion when considering transmission
in a section of the network in which two repeaters, one of which
"homes" on the other, compete. This, however, may not be realized
in the system simulated because of the limited storage available in
repeaters.

Observations 2 and 3 suggest a change in the Terminal-Station
protocol. The basic question is whether a terminal should release
itself from the system or whether it should be released by the station.
The former was initially simulated. It was observed that in many
cases, a terminal departed from the system after receiving an Echo
to the ETE Ack for the last IP without this ETE Ack arriving at the
station. This resulted in the reactivation of IP's by the station
for this terminal, the routing of these packets in the net, and then
the maximum number of transmissions and search by the repeater nearest
to the terminal. The protocol simulated in the systems discussed
later is such that the last packet must always be from the station to
the terminal. This transmission may be considered as a terminal
release packet. Another change in protocol implemented is that
whenever possible, the terminal acknowledges a sequence of packets rather
than individual ones, to reduce the overhead in the direction towards
the station.
X. THE TRADEOFF BETWEEN TRANSMISSION RANGE OF DEVICES
AND NETWORK INTERFERENCE


For the experiments discussed in the previous section, it was
assumed that Repeater-Repeater range is the same as Terminal-Repeater
range. This, however, is not always a realistic assumption since
repeaters can be placed on elevated areas and can have more power
than terminals (especially hand held terminals). Thus, if
repeaters are allocated for area coverage of terminals, the repeater
range will be higher than terminal range and higher network
connectivity or device interference will result.

The problem which then arises is to determine the impact of
this interference on system performance. Alternatively, one may
seek to reduce repeater transmission power when transmitting in
the repeater-station network. To study this issue, two CCSDR systems
were simulated, one with High Interference, CCSDR (HI), and the other
with Low Interference, CCSDR (LI). The routing labels of the two
systems were the same and are shown in Figure 7. The interference
of the CCSDR (LI) system is shown in Figure 8 and the interference
of the CCSDR (HI) system in Figure 9. Figure 9 shows only the
connectivity of two devices in the network.

The  results  are  shown  in  Figure  10  and  Table  1.  Figure  10 
shows  the  throughput  of  the  two  systems  as  a function  of  time 
while  Table  1 summarizes  all  other  measures  of  performance.  The 
third  row  of  Table  1 summarizes  performance  of  the  high  interference 
system  under  an  improved  set  of  repeater  labels.  This  experiment 
is  discussed  in  detail  in  the  next  section.  It  is  clear  that  the 
high  interference  system  is  much  better  than  the  low  interference 
system.  The  only  measure  of  the  low  interference  system  which  is 
better  is  terminal  blocking  which  is  a direct  result  of  the  low 
interference feature. In fact, CCSDR (LI) is saturated at the
offered traffic rate. This can be seen from the fact that the
throughput is decreasing as a function of time, from the relatively
high total loss, and from the low station response*. The CCSDR (HI)
system with improved labels, also compared in Table 1, is better than
the other two systems. This demonstrates the importance of proper
labelling. The experiments of this section demonstrate that it
is preferable to use high transmitter power to obtain long repeater
range, despite the network interference that results.


* The average number of station response packets assumed for these
studies is 2.0.

XI. SINGLE VERSUS DUAL DATA SIGNALLING RATES NETWORKS

The results of the previous section demonstrate that a better
performing system is obtained when repeaters and station use high
power to obtain long range, despite the interference that results.

We now examine whether repeaters and station should use their fixed
power budgets to obtain a long range with a low data rate channel or
a short range with a high data rate channel. The following systems
were studied.

• A CCSDR  (HI)  of  the  previous  section  with  improved 
labels,  which  we  denote  by  CCSDR.  That  is,  we  take 
advantage  of  the  high  range  to  improve  the  routing  labels 
of  repeaters  and  obtain  fewer  hierarchy  levels.  The 
routing  labels  used  are  shown  in  Figure  11,  and  the 
connectivity  is  shown  in  Figure  9. 

• A Common Channel Two Data Rate (CCTDR) system with the
routing labels as in Figure 7 and connectivity as in Figure 8.

In  the  CCTDR  system,  the  terminal  has  a low  data  rate  channel, 
the  same  rate  as  in  the  single  data  rate  system,  for  communication 
with  a repeater  or  station.  Repeaters  and  station  have  two  data 
rates.  The  high  data  rate  is  used  for  communication  in  the 
repeater-station  network.  The  two  data  rates  use  the  same  carrier 
frequency  so  that  only  one  can  be  used  at  a time. 

The two systems were tested with offered rates of 13% and 25%.*
The throughput as a function of time for the two runs is shown in
Figures 12 and 13, respectively, and the summary of other measures
is given in Table 2. The comparison demonstrates that the CCTDR
system is superior to the CCSDR system in terms of throughput,
delay, and other measures. One can see that the CCSDR system is
saturated at an offered rate of about 13%.

* In the simulation runs we used the inverse square law for the
relation between data rate and distance, rather than the result in
[9]; this, however, favors CCSDR.

* Effect  on  Blocking  Level 

In Table 2, one can see that one reason for the relatively
low throughput of the CCSDR system at an offered rate of 25%
is blocking. Furthermore, the fraction of time that
the station is busy has decreased. This may suggest that the
station could handle more terminals, provided they
are able to enter the system. To examine this point, we ran
the CCSDR system with an offered rate of 25% and relaxed the
constraint for entering the system. Rather than resulting in
better performance, this step resulted in a reduction in blocking
and an increase in delay. The throughput increased to 12.63%,
the blocking decreased to 18.35%, and the total loss decreased
to 30.73%. On the other hand, the delay increased to 57.82,
the fraction of time the station is busy increased to .57,
and the rate of station response decreased to 1.32.

To  conclude,  when  we  enabled  more  terminals  to  enter  the 
system,  the  throughput  increased  insignificantly,  from  12.20% 
to  12.63%;  on  the  other  hand,  the  average  packet  delay  increased 
significantly,  from  34.97  to  57.82  terminal  slots.  This 
suggests  that  one  of  the  important  design  problems  in  the 
packet  radio  system  is  the  blocking  level  of  terminals. 

[Figure: Throughput vs. terminal slots, 13% rate]

XII. PRELIMINARY RESULTS OF MAXIMUM THROUGHPUT, LOSS, AND DELAY
OF CCSDR AND CCTDR SYSTEMS

In the packet radio system there is an absolute maximum throughput
(independent of loss and delay) because of the interference
characteristics. As with curves of throughput versus channel
traffic when the relation is known analytically [7], we draw the
curves of system throughput vs. offered rate to estimate the
maximum throughput. Figure 14 shows the throughput versus offered
rate for the CCSDR and CCTDR systems. The curves are linear for low
offered rates and saturate when the offered rate increases.

For the CCSDR system one can see that the throughput is
practically the same when the offered rate is increased from 13% to
25%. This and the other measures in Table 2 (for example, the
rate of station response) show that the system is overloaded at a
25% offered rate. On the other hand, the system seems to operate
at steady state at an offered rate of 13% (rate of station response
2.06). A rough estimate of maximum throughput for this system
would be between 12% and 15%. Similar observations of the
performance measures lead to an "estimate" of between 27% and 30% for
the maximum throughput of the CCTDR system.

The  average  delay  of  the  first  Information  Packet  from  terminal 
to  station,  and  the  Total  Loss,  as  a function  of  offered  rate  are 
shown  in  Figure  15  and  Figure  16,  respectively. 

Remark: There are many parameters in the simulation program
which we have not experimented with (or tried to
optimize) and which affect the quantities
discussed above. One parameter which is significant
in determining the maximum throughput
is the average number of response packets from
station to terminal. The effect of this parameter
has been analyzed in [10] for a slotted ALOHA random
access mode. It has been shown that the maximum throughput
is increased in the Common Channel system when the
rate of response increases, and the maximum throughput
tends to 100% of the data rate when the rate of response
tends to infinity. We expect that this parameter has a
similar effect for the mode of access simulated. In
the results reported here the rate of response is 2.0,
which is small compared to usual estimates for terminals
interacting with computers. Furthermore, the relatively
short terminal interaction increases the traffic overhead
of the search procedure per information packet.

XIII. FUTURE DEVELOPMENT OF THE PACKET RADIO SIMULATOR


We  outline  several  of  the  future  developments  for  the 
simulator;  some  of  these  are  in  the  implementation  stage. 


• Initialization and Labelling of Repeaters


Preliminary experiments have shown that the Hierarchical
Labelling algorithm is much more efficient than the other two
algorithms of Section VI, subsection B. Consequently, it is
recommended for implementation in the Packet Radio System.

In many cases, however, the connectivity between devices in the
network will not be known a priori. For example, in a military
application one may wish to establish a network by distributing
repeaters at random locations, and one may not have physical
access to the repeaters. Furthermore, there may be changes in
the topology of the network due to variations in the transmission
power of devices, or when some devices cease to operate. Thus,
it is necessary to assign and reassign labels to repeaters in
an operating network.

The approach that we adopted is to use the flooding
routing algorithm to load repeaters with hierarchical labels.
The flooding algorithm was selected because it does not require
any knowledge of the topology of the network. A process for
repeater initialization and labelling, which has been detailed
in [6], is currently under implementation. Initially, it is
assumed that the station contains a set of fixed identifiers
of repeaters which may possibly be connected into a network;
three stages are then followed. In stage 1 the station transmits
special control packets to the above repeaters, and repeaters
respond with control packets from which a connectivity
matrix between repeaters is established. The hierarchical
labels are determined in stage 2 from the connectivity matrix.
In stage 3 the station transmits the labels to repeaters and
tests each path in both directions, from station to repeater
and from repeater to station.
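
Stage 2 can be illustrated with a short sketch. The labelling rule
itself is defined in Section VI and in [6], not in this section, so the
code below assumes, purely for illustration, that the hierarchical label
of a repeater is the list of devices on a minimum-hop path from the
station to that repeater, found by breadth-first search over the
connectivity matrix. The function name hierarchical_labels, the
convention that device 0 is the station, and the example matrix are all
hypothetical.

```python
from collections import deque

def hierarchical_labels(conn, station=0):
    """Assign each reachable device a label: its minimum-hop path from the station.

    conn is a symmetric 0/1 connectivity matrix (list of lists), such as
    the one established in stage 1. The label format is an assumption
    made for this sketch, not the report's actual format.
    """
    labels = {station: [station]}
    queue = deque([station])
    while queue:
        u = queue.popleft()
        for v, linked in enumerate(conn[u]):
            if linked and v not in labels:
                labels[v] = labels[u] + [v]   # parent's label plus own identifier
                queue.append(v)
    return labels

# Hypothetical 5-device network: station 0, repeaters 1-4.
conn = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
labels = hierarchical_labels(conn)
for device in sorted(labels):
    print(device, labels[device], "level", len(labels[device]) - 1)
```

In this sketch a repeater that never answers the stage 1 control packets
has an all-zero row in the matrix and simply receives no label.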

• Flow  Control 

Control  packets  for  changing  the  operating  parameters 
of  devices,  and  algorithms  for  using  these  will  be  implemented. 
For  example,  turning  repeaters  "on"  and  "off",  changing  the 
parameter  for  the  maximum  number  of  transmissions,  etc. 

• Access  Modes 


In [7] it has been shown that one of the important
parameters which affects the performance of the carrier sense
access modes is the ratio of the propagation time between
devices to the packet transmission time. Specifically, the
relative performance deteriorates when this ratio
increases, which is the case when the data signalling rate
is increased and the number of bits in an information packet
is kept constant. Thus, for some operating parameters the
carrier sense access modes may not show a much better
performance than the simpler nonslotted ALOHA [11] random
access scheme. These problems will be studied in a network
environment by simulating the latter, for comparison, and by
studying the carrier sense performance as a function of the
data signalling rate.
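
The dependence of this ratio on the data signalling rate follows
directly from its definition. The sketch below computes the ratio for a
fixed packet length; the distance, packet size, and data rates are
hypothetical values chosen only for illustration, and the helper name
propagation_ratio is not taken from the simulator.

```python
SPEED_OF_LIGHT = 3.0e8  # metres per second (approximate)

def propagation_ratio(distance_m, packet_bits, rate_bps):
    """Ratio of propagation time to packet transmission time."""
    propagation_time = distance_m / SPEED_OF_LIGHT
    transmission_time = packet_bits / rate_bps
    return propagation_time / transmission_time

# Hypothetical values: 30 km between devices, 1000-bit packets.
for rate in (100e3, 500e3, 2e6):
    a = propagation_ratio(30e3, 1000, rate)
    print(f"{rate / 1e3:7.0f} kbit/s  ratio = {a:.3f}")
```

With the packet length held constant, the ratio grows in proportion to
the data rate, which is the regime in which the carrier sense advantage
is expected to shrink.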

• Directional Antenna at the Station

Analysis has shown [12] that directional antennas at the
station may increase the system capacity. This can possibly
be verified and quantified by simulating such an antenna.

• Capture

Currently the non-capture system is simulated. Capture
models which reflect the practical performance of hardware
are under development and will be simulated.

PACKET RADIO SYSTEM CONSIDERATIONS -
NETWORK CAPACITY TRADEOFFS


I.  INTRODUCTION 


Packet switching over radio channels with random access schemes
is of current interest for local distribution systems and for satellite
channels. This mode of operation is useful when the communicating
devices are mobile and when the ratio of the peak to average data rate
requirement of each device is high. Such systems have been analyzed
for the case in which all communicating devices are within an effective
transmission range of each other, either directly or through the
satellite. These analyses were originally done for the ALOHA system
[1] and for a satellite channel [4]. The models used do not
sufficiently describe the Packet Radio System, because in
the packet radio system there is a network of repeaters which separate
an originating device from a destination device (terminals and stations).

In this chapter, we address broadcast networks in which originating
devices cannot directly reach the destination receiver. Thus, repeaters
are introduced which receive these packets and repeat them to the
destination. The capacity (maximum throughput) of such systems is
determined, and design problems related to the number of repeating devices
and the usefulness of directional antennas are resolved.

The model used in this chapter can describe systems other than
the packet radio system, and we discuss it in the more general context.

One way to categorize channel allocation schemes for data
transmission is the following:

1. Fixed assignment (FDM, TDMA)

2. Dynamic assignment with centralized control (polling
   schemes, reservation upon request)

3. Dynamic assignment with distributed control (loop
   networks, random reservation schemes)

4. No assignment (random access schemes)

It has been recognized that a fixed allocation of channel capacity
is extremely wasteful when the traffic of users is of a bursty nature;
that is, when the traffic requirements of the users can be characterized
as having a high peak to average data rate. Users can be so
characterized in an inquiry response application. In fact, if one
characterizes a set of users by their number and by the ratio of the
peak to average data rate requirements of each user, one can conjecture
that when this number and ratio are increasing, one obtains a
higher channel utilization when proceeding (along the categorization)
from the fixed assignment to the no assignment allocation schemes.
For example, when the time delay to make a reservation or the average
time between two consecutive pollings of a user is large compared to
the fraction of time that the user wishes to use the channel, then
dynamic assignment with centralized control becomes inefficient (apart
from the need for a system for polling or making reservations).

Roberts [7] has demonstrated the cost advantages of a random
reservation scheme and a random access scheme over fixed assignment
when the number of users increases and the average traffic requirement
per user is kept constant. A somewhat more formal justification for
sharing a channel was given by Kleinrock and Lam [4]; we quote:

"Rather than provide channels on a user-pair basis, we much prefer
to provide a single high-speed channel to a large number of users
which can be shared in some fashion; this then allows us to take
advantage of the powerful 'large number laws' which state that with
very high probability, the demand at any instant will be approximately
equal to the sum of the average demands of that population."

Gitman, Van Slyke, and Frank [3] have addressed the problem of
splitting a channel between two classes of users. It was shown that
in almost all cases, sharing the channel results in a higher
utilization of the total channel capacity.
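
The "large number laws" argument can be made concrete with a small
numerical sketch. It assumes, for illustration only, that each user is
independently either silent or transmitting at its peak rate, and that
a shared channel is sized to the mean aggregate demand plus three
standard deviations. The on/off model, the 1% activity level, the
three-sigma margin, and the name shared_capacity are assumptions of the
sketch and do not come from the quoted reference.

```python
import math

def shared_capacity(n_users, p_active, k_sigma=3.0):
    """Capacity, in units of one user's peak rate, covering the aggregate
    demand of n independent on/off users up to k standard deviations."""
    mean = n_users * p_active
    std = math.sqrt(n_users * p_active * (1.0 - p_active))
    return mean + k_sigma * std

p = 0.01  # hypothetical: each user is active 1% of the time
for n in (10, 100, 1000, 10000):
    fixed = n                       # fixed assignment: one peak-rate channel per user
    shared = shared_capacity(n, p)
    # Last column: shared capacity relative to the sum of average demands.
    print(n, fixed, round(shared, 1), round(shared / (n * p), 2))
```

As the population grows, the shared capacity needed approaches the sum
of the average demands, while the fixed assignment grows with the sum of
the peak demands.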

In this chapter, we consider a packet switching network in which a
single radio channel is shared by all communicating devices. Devices
access the channel using the so called "slotted ALOHA" random access
scheme. When a random access scheme is used, there is a possibility
that more than one packet is simultaneously received by a receiver
due to independent transmissions of several devices. In that event,
it is assumed that none of the packets is correctly received, and the
corresponding devices have to retransmit their packets. One can see
that the number of packets transmitted is larger than the number of
originating packets. That is, part of the channel capacity is used
up by wasteful collisions, which are not considered effective
channel utilization since they do not contribute to the throughput.
For example, if each packet is transmitted two times, on the average,
before it is successfully received, then the maximum effective
utilization of the channel (or the effective channel capacity) is
one-half of the given capacity; the other one-half is used up by the
unsuccessful transmissions. The first problem that one faces is to
determine the maximum effective utilization (or system capacity) that
can be obtained. This is one of the problems addressed in the chapter.

If the channel is offered a higher rate of traffic than its
effective capacity, the system becomes unstable in the sense that
the number of transmissions increases with time and the throughput
decreases with time until zero throughput is obtained. This implies
another problem in random access schemes, namely the control
of the offered rate and retransmission strategies so as to obtain
the highest possible channel utilization [5], [8]. It is clear,
however, that retransmissions have to be randomized in time, since
otherwise once a collision between two devices has occurred, it will
persist.
The first packet radio channel system known to us in which devices
use a random access scheme was analyzed and implemented at the
University of Hawaii [1]. The random access scheme used is the so called
"pure" or "unslotted" ALOHA. In this scheme, every terminal transmits
its packets independent of any other terminal or any specific time.
That is, the terminal transmits the whole packet at a random point in
time; the terminal then times out for receiving an acknowledgement.
If an acknowledgement is not received, it is assumed that a collision
occurred and the packet is retransmitted after an additional random
waiting time. Abramson [1] obtained that the effective capacity of
the channel is 1/(2e) of the given capacity when the number of terminals
is very large and when the point process of the beginning of packet
transmission onto the channel is Poisson.

It was realized that a gain in capacity can be obtained if the
channel is slotted into segments of time whose duration is equal to
the packet transmission time, and if terminals are required to begin
the transmission at the beginning of a time slot. The access scheme
is random in the sense that terminals transmit into a random slot in
time and retransmit after waiting a random number of slots. This
scheme is called "slotted ALOHA." Roberts [6] has shown that the
capacity of this scheme is 1/e of the given capacity, using the same
assumptions as Abramson. The slotted ALOHA random access scheme was
further analyzed in [2], [3], [4], and [5] for the case of a small
number of terminals and when there is a mixture of traffic from a
small number of "big" users and a large number of "small" users.
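
The two capacity figures quoted above can be checked numerically. Under
the Poisson assumption, the standard throughput expressions are
S = G exp(-2G) for unslotted ALOHA and S = G exp(-G) for slotted ALOHA,
where G is the channel traffic per packet time. The sketch below scans
G on a grid and compares the maxima with 1/(2e) and 1/e; the function
names and the grid are choices made for this illustration.

```python
import math

def pure_aloha_throughput(g):
    """Throughput per packet time for unslotted ALOHA with Poisson channel traffic g."""
    return g * math.exp(-2.0 * g)

def slotted_aloha_throughput(g):
    """Throughput per slot for slotted ALOHA with Poisson channel traffic g."""
    return g * math.exp(-g)

# Scan the channel traffic G on a fine grid and locate the maxima numerically.
grid = [i / 1000.0 for i in range(1, 3001)]
g_pure = max(grid, key=pure_aloha_throughput)
g_slot = max(grid, key=slotted_aloha_throughput)

print(g_pure, pure_aloha_throughput(g_pure), 1 / (2 * math.e))
print(g_slot, slotted_aloha_throughput(g_slot), 1 / math.e)
```

The maxima occur at G = 0.5 and G = 1 respectively, so slotting the
channel doubles the capacity.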

II. PROBLEM DESCRIPTION


In a network context, the analyses of all the references address
a "single hop network." That is, when considering a terrestrial
system, it is implicitly assumed that all devices are within an effective
distance of each other; the same is true when considering a satellite
channel, since the satellite echoes the packets to the destination
station. Such a network can be described as one in which there is a
single receiver and many transmitters, all of which are within an
effective transmission range of the receiver and of each other.

In cases in which the transmission range of terminals is not
sufficient to reach the destination receiver, it is necessary to
introduce another device which will receive the packets from the terminals
and repeat them to the final destination. Such a network can be used
for local distribution and collection of traffic, in which case the
station is a gateway to a point-to-point network. We particularly
consider a 2-Hop network model in which there is a large number of
terminals in the neighborhood of each repeater and in which the
transmission range of terminals is short, so that a terminal can reach only
one repeater, as shown in Figure 1. Our model can be useful as a
distribution model for a suburban area where, instead of supplying
each terminal with a powerful transmitter, we allocate repeaters
which collect the traffic from terminals.

In a military application, one can think of a large unit which
uses such a 2-Hop network, where each subunit (the terminals of which
are relatively close to each other and may be mobile) has its own
repeater, and the station is at the Headquarters. In fact, the network
can be operative when the whole unit is moving, provided each subunit
carries its repeater or station along with it, e.g., a fleet of ships.

This network can also model a satellite case, as shown in Figure 2.
In the analyses of a satellite channel reported, it was assumed that
a terminal is a ground station which originates its traffic or which
is connected to a point-to-point network. However, suppose there are
clusters of (possibly mobile) terminals, where in each cluster they
are relatively close to each other, and wish to communicate with a
remote central computer installation. Then, one can devise a Ground
Station repeater to communicate with the terminals, and the ground
station repeater may use a satellite channel to transmit (second hop) to
the computer installation.

Before we specify the problems to be addressed, we comment on the
operation of the repeater. We have indicated that a terminal
retransmits a packet after a time out if it does not receive an
acknowledgement. The repeater can operate in a very simple manner in
which it repeats only once a packet that it correctly receives. In this
case, the terminal has to time out for a longer period of time to wait
for an acknowledgement from the station (end-to-end acknowledgement).
Alternatively, the repeater can be responsible for successful
transmission on the second hop, by acknowledging the terminal, storing
the packet, and repeating it until it receives an acknowledgement from
the station (hop-by-hop acknowledgement). This problem has been
considered elsewhere, and it was shown that a hop-by-hop acknowledgement
operation is more efficient. Thus, we will assume a repeater of that type.

The first problem that we address is to determine the network
capacity as a function of the number of repeaters and the interference
between repeaters. Another problem of interest is to determine the
capacity bottleneck (critical hop); i.e., whether the capacity
bottleneck is on the hop from terminals to repeaters or on the hop from
repeaters to station. The system is assumed to have two channels;
one channel is used for transmission from terminals via repeaters to
the station and the second channel for transmission from the station
via repeaters to the terminals. The traffic of acknowledgement packets
is not considered, which implicitly assumes that there is a separate
channel for acknowledgements. The two channels are analyzed separately
and design questions such as the following are considered: Is it
useful to have directional antennas at repeaters when transmitting to the
station? This question is relevant in the terrestrial system, since
in the satellite system the ground station repeaters will presumably
be out of range of each other and use directional antennas.
Similarly, in a terrestrial system, one may ask about the usefulness of
using a directional antenna at the station when transmitting to
repeaters. Other questions relate to the possibility of using several
transmitters and antennas at the station when transmitting to repeaters.


[Figure 2: A satellite system]

III. TRANSMISSION FROM TERMINALS TO STATION


Consider a system where m repeaters receive packets from
terminals and repeat the packets to a single station, as shown
in Figure 3. We denote by G and S the rate of packet transmission
per slot and the rate of successful packet transmission per slot,
respectively. Specifically, let $G_{1i}$ and $S_{1i}$ be the rates of
transmission from terminals to repeater i, and $G_{2i}$ and $S_{2i}$ the
rates from repeater i to the station. We wish to obtain the probability
that a repeater is idle.

A single hop network is the case in which a set of terminals
transmit to a repeater and the repeater is the final destination
and does not repeat the packets. Thus, the probability that a
transmission from a terminal to the repeater is successful is the
probability that no other terminal transmits into that same slot.
That is, if we assume that a packet is transmitted into the first
slot after it becomes ready for transmission, then the probability of
success is the probability that no new packet has arrived and that
no other packet has been scheduled for retransmission in the interval
of time of the preceding slot. In the network case, however, a
transmission from a terminal to a repeater will also be unsuccessful
when the repeater uses the same slot for transmitting
to the station, or if another repeater, within an effective
transmission range of the first and using an omnidirectional antenna,
uses the same slot for transmission to the station.

Throughout the chapter, we use the following assumptions. The
combined point process of packet origination and packets scheduled
for retransmission, from each set of terminals to a repeater, is
Poisson. Thus, the probability of no arrival during a slot time
$\tau$ is $e^{-G_{1i}\tau}$; we use $\tau = 1$. The probabilities of
transmission by a repeater into different slots are independent. The
transmissions by two or more repeaters into a randomly chosen slot
are mutually independent, and the transmissions into
a random slot by a terminal and by a repeater are independent. The
assumption of a Poisson distribution for the process of packet
origination plus retransmission has been questioned in previous
publications. The validity of this assumption is important when
one considers packet delays, since if one assumes a Poisson point
process for packet originations, then the assumption that the
combined process, of origination plus retransmissions, is Poisson
is valid only when one allows very large packet delays. In this
chapter, we do not require finite delays and our interest is in the
ultimate system capacity.

Finally, let $Q_i$ denote the set of repeaters which have an
effective transmission range to repeater i; $Q_i$ includes repeater i.
Under these assumptions we can write:

$$\Pr[\text{repeater } i \text{ is idle}] = e^{-G_{1i}} \prod_{j \in Q_i} (1 - G_{2j}) \qquad (1)$$

Similarly, the probability that the station is idle can be
written as:

$$\Pr[\text{station is idle}] = \prod_{j=1}^{m} (1 - G_{2j}) \qquad (2)$$
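
The two idle probabilities can be evaluated directly for a small
configuration. The sketch below treats the rate of transmission of
repeater j per slot as the probability that it transmits in a given
slot, in line with the independence assumptions above; the function
names, the three-repeater layout, and the numerical rates are
hypothetical.

```python
import math

def repeater_idle_prob(i, g1, g2, q):
    """Eq. (1): probability that repeater i is idle in a slot.

    g1[i]: transmission rate from terminals to repeater i (per slot)
    g2[j]: probability that repeater j transmits in a slot
    q[i]:  indices of repeaters within range of repeater i, including i
    """
    return math.exp(-g1[i]) * math.prod(1.0 - g2[j] for j in q[i])

def station_idle_prob(g2):
    """Eq. (2): probability that no repeater transmits to the station."""
    return math.prod(1.0 - g for g in g2)

# Hypothetical 3-repeater chain: repeater 1 hears both neighbours.
g1 = [0.5, 0.5, 0.5]
g2 = [0.2, 0.2, 0.2]
q = [[0, 1], [0, 1, 2], [1, 2]]
print([round(repeater_idle_prob(i, g1, g2, q), 4) for i in range(3)])
print(round(station_idle_prob(g2), 4))
```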

We now make a few assumptions which simplify the computation
but enable us to answer the questions of interest. Specifically,
we assume that $G_{1i} = G_1$ and $S_{1i} = S_1/m$ for all i, that repeaters
share equally the load so that $G_{2i} = G_2/m$ and $S_{2i} = S_2/m$ for all i,
and that all repeaters have the same number of interfering repeaters,
$n(Q_i) = I$ for all i. We refer to I as the interference level.

We can now write the throughput equations for hop 1 and hop 2:

$$S_1 = \sum_{i=1}^{m} G_1 \left(1 - \frac{G_2}{m}\right)^{I} e^{-G_1} = m G_1 \left(1 - \frac{G_2}{m}\right)^{I} e^{-G_1} \qquad (3)$$

$$S_2 = \sum_{i=1}^{m} \frac{G_2}{m} \left(1 - \frac{G_2}{m}\right)^{m-1} = G_2 \left(1 - \frac{G_2}{m}\right)^{m-1} \qquad (4)$$

Note that when m=1 all transmissions from the repeater to the
station are successful since it does not interfere with its own
transmissions.

If a repeater were a traffic source and a traffic sink, then
$G_1$ and $G_2$ could have been considered as independent variables. In
our case, however, the intensities of the processes on the two hops
are related. In particular, we consider traffic rates at which the
system can operate at steady state. That is, if one observes the
system for a long period of time, then all the packets which
successfully arrive at repeaters also successfully arrive at the station.
Thus we can use the conservation law at repeaters, namely $S_1 = S_2$.
This results in the relation:

$$G_2 = m G_1 \left(1 - \frac{G_2}{m}\right)^{I+1-m} e^{-G_1} \qquad (5)$$


One  can  now  study  the  system  performance  as  a function  of  parameters 
m and  I,  with  one  independent  variable. 


A.  Complete  Interference  System,  I = m 

This case is applicable to terrestrial networks in which
repeaters are either placed relatively close to each other or use
powerful transmitters, either of which results in interference
among all repeaters. For this case we obtain:

$$G_2 = \frac{m G_1 e^{-G_1}}{1 + G_1 e^{-G_1}} \qquad (6)$$

and

$$S_1 = \frac{m G_1 e^{-G_1}}{\left(1 + G_1 e^{-G_1}\right)^{m}} \qquad (7)$$

The capacity of this system is given by the maximum of S_1:

    dS_1/dG_1 = [m e^{-G_1} / (1 + G_1 e^{-G_1})^{m+1}] (1 - G_1) [1 - (m-1) G_1 e^{-G_1}] = 0    (8)


By examining (8), one finds that there is one stationary point, a
maximum at G_1 = 1, when m < 4.  When m ≥ 4 there are three
stationary points: a minimum at G_1 = 1, and two maximum points of the
same value at the values of G_1 which satisfy 1 - (m-1) G_1 e^{-G_1} = 0.
Substituting these values into (7), one obtains the capacity of this
(complete interference) network as a function of m:

    Network Capacity = m / [e (1 + 1/e)^{m}] ;  m < 4
                                                              (9)
    Network Capacity = (1 - 1/m)^{m-1} ;        m ≥ 4


Notice that the network capacity is lower than the capacity of a
single hop network, 1/e, when m = 1; it is higher than 1/e for m > 1,
and tends to 1/e in the limit when m → ∞.

Figure  4 shows  the  network  throughput  S1  as  a function  of 
the  rate  of  transmission  from  terminals  to  a single  repeater. 

It is interesting to observe the rate of change of S_1 with respect
to G_1.  For example, when m is large (e.g. m = 8) the system
becomes more sensitive to variations in the value of G_1.  On the
other hand, there are values of m, for example m = 2 or 3 in the
complete interference case, for which the maximum throughput
region is quite flat.
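
The closed form of Eq. (9) can be checked against a direct numerical
maximization of Eq. (7).  The following is a minimal sketch only; the
function names, the search range for G_1 and the grid size are our own
illustrative choices and are not part of the analysis.

```python
import math


def throughput_complete(g1: float, m: int) -> float:
    """Eq. (7): throughput S_1 when all m repeaters interfere (I = m)."""
    x = g1 * math.exp(-g1)
    return m * x / (1.0 + x) ** m


def capacity_closed_form(m: int) -> float:
    """Eq. (9): capacity of the complete interference network."""
    if m < 4:
        return m / (math.e * (1.0 + 1.0 / math.e) ** m)
    return (1.0 - 1.0 / m) ** (m - 1)


def capacity_by_search(m: int, g_max: float = 10.0, steps: int = 100_000) -> float:
    """Largest value of Eq. (7) on a uniform grid 0 <= G_1 <= g_max.

    The range and grid size are arbitrary illustrative choices.
    """
    return max(throughput_complete(g_max * i / steps, m) for i in range(steps + 1))


if __name__ == "__main__":
    for m in range(1, 9):
        print(m, round(capacity_closed_form(m), 4), round(capacity_by_search(m), 4))
```

Under these assumptions the two columns should agree, with the largest
capacity occurring at m = 3.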
B.  The Critical Hop

To determine the critical hop, one has to obtain the
capacity of the two hops of the network.  The capacity of the
hop from repeaters to the station is independent of G_1 and I
(see Eq. (4)), and is given by (1 - 1/m)^{m-1}.  The capacity of
the hop from terminals to repeaters, on the other hand, depends
on G_2 and I.  Note that 0 < G_2 ≤ m; G_2 > m is not realizable since we
assume that a repeater has one transmitter and cannot transmit
more than one packet per slot.  For any G_2 in the above range,
the capacity of hop 1 (see Eq. (3)) increases with m.
Thus, there exists an m_0 for which the capacity is
higher than that on the hop from repeaters to station, and the
latter becomes the critical hop.  Furthermore, m_0 depends on I,
and for I_2 > I_1, m_0(I_2) > m_0(I_1).  Thus, for m ≥ m_0 the critical hop
is that from repeaters to station, and for m < m_0 the critical hop
is from terminals to repeaters.  For example, for I = m (see Eq. (9))
m_0 = 4, and for I = m-1, m_0 = 3.

Perhaps a more direct way to answer the question of the
critical hop is to obtain the system capacity as a function of m
and I.  Then, whenever the capacity is smaller than (1 - 1/m)^{m-1}
the critical hop is that from terminals to repeaters, and when the
capacity equals this expression, the critical hop is from
repeaters to station.  However, it is difficult to obtain a closed form
solution in the general case.
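
Although no closed form is available in general, the capacity can be
evaluated numerically.  The sketch below rests on our own rearrangement
of Eqs. (3) and (4) with S_1 = S_2, namely
G_1 e^{-G_1} = (G_2/m)(1 - G_2/m)^{m-1-I}, together with the fact that
G_1 e^{-G_1} cannot exceed 1/e.  The function names and the grid search
are illustrative assumptions.

```python
import math


def is_feasible(g2: float, m: int, interference: int) -> bool:
    """True if some G_1 >= 0 satisfies the conservation law for this G_2.

    Uses G_1 e^{-G_1} = (G_2/m) (1 - G_2/m)^(m-1-I), whose left side
    can take any value in [0, 1/e].
    """
    y = g2 / m
    exponent = m - 1 - interference
    if y >= 1.0:
        # (1 - y) is zero: the right side vanishes only for a positive exponent.
        return exponent > 0
    return y * (1.0 - y) ** exponent <= 1.0 / math.e


def system_capacity(m: int, interference: int, steps: int = 100_000) -> float:
    """Maximum of Eq. (4) over the feasible values of G_2 in [0, m]."""
    candidates = [m * i / steps for i in range(steps + 1)] + [1.0]
    best = 0.0
    for g2 in candidates:
        if is_feasible(g2, m, interference):
            best = max(best, g2 * (1.0 - g2 / m) ** (m - 1))
    return best


def critical_hop(m: int, interference: int) -> str:
    """Hop 2 is critical when its own optimum, G_2 = 1, is attainable."""
    if is_feasible(1.0, m, interference):
        return "repeaters-to-station"
    return "terminals-to-repeaters"


if __name__ == "__main__":
    for m in range(1, 8):
        print(m, round(system_capacity(m, 1), 4), round(system_capacity(m, m), 4),
              critical_hop(m, m))
```

With I = m the critical hop switches at m = 4, and with I = m-1 at
m = 3, in agreement with the values of m_0 quoted above.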

C.  Number of Repeaters and Directional Antennas at Repeaters

The considerations of directional antennas at repeaters
apply only to the terrestrial system.  In the satellite system,
ground station repeaters will use directional antennas when
transmitting to the satellite and omnidirectional antennas when
communicating with terminals.

The effect of directional antennas at repeaters in the
terrestrial system is that the transmission from repeaters to the
station is directed towards the station and does not interfere
with the transmission of terminals to other repeaters.  Thus, it
is the special case with I = 1.  We notice, however, that
directional antennas do not increase the capacity of the hop from
repeaters to station because all antennas are directed towards the
same physical location where the station is placed and where the
conflicts may occur.

Figure 5 shows the capacity of the system as a function
of m, for I = m and I = 1, which are equivalent to omnidirectional and
directional antennas (or satellite system) respectively.  One can
see that there is a gain in capacity when using directional
antennas only when m = 2, and a small gain for m = 3.  For m ≥ 4 the
capacity does not increase because the critical hop is between the
repeaters and the station, so that it does not matter how much
one can get through from terminals to repeaters.

As  far  as  the  number  of  repeaters  is  concerned,  one  can 
see  from  Figure  5 that  the  maximum  system  capacity  is  obtained 
when  m=2  in  the  non-interference  case  and  when  m=3  in  the  complete 
interference  case.  Thus  2 or  3 repeaters  would  be  a good  design; 
any  additional  repeaters  that  may  be  added  because  of  other  con- 
siderations (such  as  area  coverage)  will  result  in  a reduction  in 
the  system  capacity. 
IV.  TRANSMISSION FROM STATION TO TERMINALS

In  this  section,  we  consider  the  second  channel  which  is  used 
for  transmission  from  the  station  to  terminals  via  repeaters.  In 
the  terrestrial  system,  it  is  assumed  that  the  effective  transmission 
range  of  the  station  is  such  that  it  interferes  with  the  trans- 
mission from  repeaters  to  terminals,  as  shown  in  Figure  6.  However, 
we  assume  that  terminals  cannot  directly  receive  from  the  station 
or  from  the  satellite  (otherwise,  it  becomes  a single  hop  network 
and  the  capacity  is  1).  We  use  the  notation  shown  in  Figure  6, 
where  the  first  hop  is  that  from  the  station  to  repeaters  and  the 
second  hop  is  from  repeaters  to  terminals. 

We use similar assumptions to the ones made in the previous
section.  Specifically, we assume that the probabilities of
transmission by a repeater into different time slots are independent,
that the probabilities of transmission by two or more repeaters into
a randomly chosen slot are mutually independent, and that the
probabilities of transmission by the station and by repeaters into a
random slot are mutually independent.  Finally, we simplify by
assuming that repeaters share equally the load; by which we mean
that S_{2i} = S_2/m and G_{2i} = G_2/m, for all i.  The equations which
relate the rate of transmission to the rate of successful
transmission on the two hops can now be written:

    S_2 = \sum_{i=1}^{m} (G_2/m) (1 - G_1) (1 - G_2/m)^{I-1} = G_2 (1 - G_1) (1 - G_2/m)^{I-1}    (10)

    S_1 = G_1 (1 - G_2/m)^{I}    (11)

I is the interference level as in the previous section.

For consistency with the interference model of the previous
section, we remark the following.  Eqs. (10) and (11) are for the
case in which the same energy-per-bit-to-noise-density is required
for detection with equal error rates, by the repeater and by the
terminal; and when the repeater uses a higher transmitter power.  In
the more general case, one has to replace I in (10) by I_T, which will
designate the number of sets of terminals with which a transmission
from a repeater may interfere; and replace I in (11) by I_R, which will
indicate the number of repeaters with which a transmission from a
repeater may interfere.

We assume again steady state system operation and use the
conservation law at repeaters, i.e., S_1 = S_2.  This results in

    G_2 = G_1 / [1 - ((m-1)/m) G_1]    (12)

One can now substitute (12) into (10) and (11) to obtain the
throughput as a function of one independent variable.  For S_1, one obtains:

    S_1 = G_1 [ (1 - G_1) / (1 - ((m-1)/m) G_1) ]^{I}    (13)

To obtain the capacity one can maximize either S_1 or S_2.  Equating
dS_1/dG_1 to zero, we obtain that there are three stationary points:
a minimum at G_1 = 1 and two maximum values at

    G_1^{2,3} = [ (2m + I - 1) ± \sqrt{4mI + (I-1)^2} ] / [ 2(m-1) ]    (14)

We now examine the constraints.  For the system to be realizable,

    0 ≤ G_1 ≤ 1 ;  0 ≤ G_2 ≤ m ;  1 ≤ I ≤ m    (15)

From Equation (14), one can see that the plus sign results in
G_1 > 1, since

    2m + I - 1 + \sqrt{4mI + (I-1)^2} > 2m + 2(I-1) > 2(m-1)    (16)

Furthermore, from Equation (12), one can see that when 0 ≤ G_1 ≤ 1,
then 0 ≤ G_2 ≤ m.  Thus, the only realizable maximum is given by
Equation (13) with G_1 as in Equation (14) when taking the minus
sign of the square root.
Figure 7 shows the system throughput as a function of G_2
for m = 6 and I as a parameter.  One can see that there is a
high degradation in system performance when the interference level
I increases.  Figure 8 shows the capacity of the system as a
function of the number of repeaters m for the non-interference (I = 1)
and the complete interference (I = m) cases.  When 1 < I < m, the
curve will be between the two shown.  It can be seen that there
is a large difference in system capacity between the
non-interference and the complete interference systems, and that this
difference increases with the number of repeaters m.  For I = 1, the
capacity of the system is m (\sqrt{m} - 1)^2 / (m - 1)^2, which tends to
1 when m tends to infinity.
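
The realizable maximum of Eq. (13) can be evaluated directly from the
minus root of Eq. (14).  The following sketch uses function names of
our own choosing; the separate treatment of m = 1, where the quadratic
behind Eq. (14) degenerates to a linear equation, is our assumption and
is not discussed in the text.

```python
import math


def g1_at_capacity(m: int, interference: int) -> float:
    """Minus root of Eq. (14), the only stationary point with 0 < G_1 < 1."""
    b = 2 * m + interference - 1
    if m == 1:
        # The underlying quadratic m - b*G + (m-1)*G^2 = 0 becomes linear.
        return m / b
    disc = 4 * m * interference + (interference - 1) ** 2
    return (b - math.sqrt(disc)) / (2 * (m - 1))


def throughput(g1: float, m: int, interference: int) -> float:
    """Eq. (13): throughput as a function of the station rate G_1."""
    return g1 * ((1.0 - g1) / (1.0 - (m - 1) / m * g1)) ** interference


def capacity(m: int, interference: int) -> float:
    """Eq. (13) evaluated at the realizable maximum."""
    return throughput(g1_at_capacity(m, interference), m, interference)


if __name__ == "__main__":
    for m in (2, 6, 20, 100):
        print(m, round(capacity(m, 1), 4), round(capacity(m, m), 4))
```

For I = 1 the computed values should coincide with
m (\sqrt{m} - 1)^2 / (m - 1)^2, and the gap between the I = 1 and I = m
columns should widen as m grows.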

In  the  terrestrial  system,  the  interference  level  depends  on 
the  transmission  power  of  the  repeaters  when  transmitting  to 
terminals.  Thus,  it  would  be  advantageous  to  use  as  low  trans- 
mission power  as  possible  sufficient  to  reach  the  terminals,  or 
possibly  an  adaptive  power  mechanism. 


Directional  Antennas  and  Multiple  Transmitters  at  the  Station 
This  section  is  addressed  only  to  a terrestrial  system. 

When the station uses a directional antenna, then its
transmission to repeater R_i does not interfere with R_j, j ≠ i.
Consequently, the average rate of transmission per slot to a single
repeater is G_1/m (assuming equal share of load).  The only change
that would result in Equations (10) and (11) is the replacement
of G_1 in Equation (10) by G_1/m.  Doing so and equating S_1 and S_2
results in G_1 = G_2 and:

    S_1 = G_1 (1 - G_1/m)^{I} ;  0 ≤ G_1 ≤ 1 ;  1 ≤ I ≤ m    (17)

    S_2 = G_2 (1 - G_2/m)^{I} ;  0 ≤ G_2 ≤ m ;  1 ≤ I ≤ m    (18)
The capacity of the system is given by

    (m/(I+1)) (1 - 1/(I+1))^{I} ;  m ≤ I + 1    (19a)

    (1 - 1/m)^{I} ;                m > I + 1    (19b)


Figure 9 shows the capacity of the system as a function
of m for I = 1 and I = m.  The same curves for an
omnidirectional antenna are shown as a reference.  It can be seen that
the capacity of the system with a directional antenna is
substantially higher; in particular, when the interference level
is low.
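
Equations (17) and (19) are simple enough to evaluate directly.  The
sketch below, with function names of our own choosing, computes the
single-transmitter capacity and can be compared against a grid search
over 0 ≤ G_1 ≤ 1.

```python
def throughput_directional(g1: float, m: int, interference: int) -> float:
    """Eq. (17): throughput with a directional antenna at the station."""
    return g1 * (1.0 - g1 / m) ** interference


def capacity_directional(m: int, interference: int) -> float:
    """Eqs. (19a) and (19b): capacity with a single station transmitter."""
    if m <= interference + 1:
        # Stationary point G_1 = m / (I + 1) lies inside 0 <= G_1 <= 1.
        return m / (interference + 1) * (1.0 - 1.0 / (interference + 1)) ** interference
    # Otherwise the maximum sits on the boundary G_1 = 1.
    return (1.0 - 1.0 / m) ** interference


if __name__ == "__main__":
    for m in range(1, 9):
        print(m, round(capacity_directional(m, 1), 4), round(capacity_directional(m, m), 4))
```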

We now address the question of multiple transmitters
and antennas at the station.  Consider the capacity of the
system with a directional antenna at the station, Equation
(19).  The maximum given by (19a) is a stationary point
whereas that given by (19b) is a boundary point at G_1 = 1.
Also note that

    (m/(I+1)) (1 - 1/(I+1))^{I} > (1 - 1/m)^{I} ,  for m > I + 1    (20)

Thus, if one increases the domain of G_1, it would result
in an increase in system capacity, which will then be given
by Equation (19a).  This is exactly what happens when adding
additional transmitters and antennas to the station.  It
enables the station to transmit more than one packet into
the same "slot in time" and direct the packets to different
repeaters.  In practice, if the station has several
transmitters and directional antennas which enable it to
transmit simultaneously in different directions, then one can
devise an algorithm at the station which will properly
select the directions to which packets are simultaneously
transmitted, so as to further reduce the interference
level I (manage its transmissions).

Figure  10  shows  the  system  throughput  as  a function 
of  the  rate  of  transmission  from  station  to  repeaters  (or 
from repeaters to terminals; note that G_1 = G_2).  One can see
that for I = 1 and I = 3, the maximum system capacity
cannot be obtained with a single transmitter and the value
obtained is at the boundary point G_1 = 1 and is given by
(19b).  Notice that when the number of repeaters is large
and  the  interference  level  is  low,  then  there  is  a large 
difference  between  the  maximum  capacity  and  the  constrained 
maximum  capacity. 

We  now  determine  the  minimum  number  of  transmitters 
and  directional  antennas  needed  at  the  station.  If  the 
interference  level  I is  constant  then  the  unconstrained 
capacity  of  Equation  (19a)  is  increasing  with  the  number 
of  repeaters  m.  Moreover,  the  capacity  increases  also  in 
the case that I is a linear function of m.  For let I = km,
1/m ≤ k ≤ 1, then the unconstrained capacity is given by

    S^* = (m/(km+1)) (1 - 1/(km+1))^{km}    (21)

and

    dS^*/dm = [km/(km+1)^2] (1 - 1/(km+1))^{km-1} [1 + km \ln(1 - 1/(km+1))] ≥ 0 ,  for k > 0    (22)

Since the capacity is increasing as a function of m, one
can obtain a capacity greater than 1 (see for example Figure
10, I = 1).  The minimum number of transmitters that can
realize the unconstrained capacity is given by the rate
of transmission from the station to repeaters at which
the maximum utilization is obtained.  That is:

    Minimum Number of Transmitters = \lceil m/(km+1) \rceil    (23)

where \lceil x \rceil is the smallest integer greater than or equal
to x.  Equation (23) also implies that multiple transmitters will not
result in an increase in system capacity when the
interference level is high, i.e., m ≤ km + 1.  It is easy to
verify that when the constraint on G_1 is satisfied, the
constraint on G_2 is also satisfied.
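
As an illustration of Equations (21) and (23), the sketch below
computes the unconstrained capacity and the corresponding number of
transmitters for a given m and interference level I = km.  The function
names are ours, and I is passed as an integer for simplicity.

```python
def unconstrained_capacity(m: int, interference: int) -> float:
    """Eq. (19a) without the restriction on G_1, i.e. Eq. (21) with I = km."""
    return m / (interference + 1) * (1.0 - 1.0 / (interference + 1)) ** interference


def min_transmitters(m: int, interference: int) -> int:
    """Eq. (23): ceiling of m / (I + 1), computed in integer arithmetic."""
    return (m + interference) // (interference + 1)


if __name__ == "__main__":
    for m, interference in [(6, 1), (6, 3), (6, 6), (12, 1)]:
        print(m, interference, min_transmitters(m, interference),
              round(unconstrained_capacity(m, interference), 4))
```

For m = 6 and I = 1, for instance, three transmitters are called for
and the unconstrained capacity exceeds 1, whereas for m ≤ I + 1 a
single transmitter suffices.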

By  associating  a cost  value  with  a repeater  and  with 
each  additional  transmitter  at  the  station,  one  can  formulate 
a design optimization problem in which the system cost is
traded against the resulting increase in capacity.

The  results  of  this  section  demonstrate  that  a single 
slotted  ALOHA  channel  can  be  used  (and  reused)  spatially  to 
obtain  channel  utilization  higher  than  100%. 
CONCLUSIONS


Our conclusions from the analysis are outlined.

A.  Transmission to Station

1.  When the number of repeaters is small (the exact
number depends on the interference level), then the
critical hop in the network (the capacity bottleneck)
is from terminals to repeaters.

2.  When  the  number  of  repeaters  is  large,  the  critical 
hop  is  from  repeaters  to  station. 

3.  The  2-Hop  network  design  which  maximizes  the  sys- 
tem capacity  has  2 repeaters  when  the  interference  is 
minimum,  and  3 repeaters  when  the  interference  is  max- 
imum. Additional  repeaters  reduce  system  capacity. 

4.  The  capacity  of  a 2-Hop  network  is  higher  than 
that  of  a 1-Hop  network  when  the  number  of  repeaters 
is  2 or  more,  and  is  lower  than  the  capacity  of  a 
1-Hop  network  when  there  is  one  repeater. 

5.  Directional antennas at repeaters increase system
capacity when the critical hop is from terminals to
repeaters.  The increase is significant only in the
case when there are 2 repeaters, which do otherwise
interfere with each other.  In other cases, the
increase in capacity is either insignificant or does
not exist.
B.  Transmission from Station

1.  The  interference  of  the  station  with  the  trans- 
mission of  repeaters  to  terminals  reduces  significantly 
the  system  capacity.  Thus,  if  possible,  it  is  important 
to  enable  terminals  to  receive  such  transmissions. 

2.  The  system  capacity  reduces  substantially  when 
the  interference  level  between  repeaters  is  increased. 
Note  that  this  is  not  the  case  when  transmitting  to 
the  station;  compare  Figures  5 and  8.  Consequently, 
in  a terrestrial  system,  it  is  important  to  reduce 
the  interference  factor  by  a mechanism  such  as  adap- 
tive power. 

3.  A directional antenna at the station in a
terrestrial system significantly increases the system
capacity when the interference level between repeaters
is low to moderate.  This is not the case when the
interference level is high, since the throughput on
the hop from repeaters to terminals is limited due to
this interference.

4.  When  the  station  has  directional  antennas,  then 
multiple  transmitters  and  antennas  may  further  increase, 
significantly,  system  capacity.  Note  that  in  this 
case,  one  can  obtain  a capacity  greater  than  1. 

5.  An  equation  for  the  number  of  transmitters  needed 
at  the  station  is  given.  This  number  increases  when 
the  interference  level  decreases.  When  the  interference 
level is high, then none of the devices in conclusions
B.3, B.4, B.5 are desirable, since the capacity is
limited by the throughput from repeaters to terminals.


REMARKS:

Conclusions  A. 5 and  B.3  which  relate  to  directional  antennas, 
imply  that  directional  antennas  are  generally  not  useful  when  devices 
which  use  them  direct  transmissions  to  a single  location  in  space; 
because  the  interference  at  this  location  is  not  avoided.  On  the 
other  hand,  directional  antennas  are  very  useful  when  oriented  to 
different azimuths, because one can take advantage of the spatial
distribution of receivers.


PACKET RADIO SYSTEM CONSIDERATIONS - CHANNEL CONFIGURATION


I.  INTRODUCTION 


Consider  a communication  channel  which  is  shared  by  two  in- 
dependent sources  of  traffic  in  a broadcast  mode.  Source  1 is 
generated  by  an  infinite  number  of  terminals,  each  with  an  infin- 
itesimal traffic  rate,  and  which  collectively  form  a finite  Poisson 
source.  Source  2 is  generated  by  a finite  number  of  terminals 
each  with  a finite  traffic  rate.  The  terminals  transmit  fixed 
size  packets  and  access  the  channel  using  the  so  called  "slotted 

ALOHA"  random  access  scheme.  A terminal  can  transmit  to  any  other 
terminal  in  the  system. 

This  model  can  describe  the  ALOHA  system  at  the  University  of 
Hawaii  [1]  or  the  Packet  Radio  System  [6].  In  the  Packet  Radio 
System  there  is  a large  number  of  terminals  which  communicate 
with  a small  number  of  stations.  The  terminals  can  be  modelled 
by  the  terminals  of  Source  1 and  the  stations  by  the  terminals 
of Source 2.  The model is suitable for a Packet Radio System
in  an  urban  area  where  a terminal  can  directly  transmit  to  a 

station  and  where  any  transmission  from  a terminal  or  station  inter- 
feres with  all  other  devices. 

Roberts  [5]  has  shown  that  if  all  the  traffic  is  contributed 
by  Source  1,  then  the  maximum  throughput  which  can  be  obtained 
is  1/e  of  the  channel  capacity.  Abramson  [2]  and  Kleinrock  and  Lam 
[3]  have  shown  that  the  maximum  throughput  can  be  increased  when 

the  traffic  is  composed  of  contributions  from  both  Source  1 and 
Source  2. 

We  approach  the  problem  from  a synthesis  viewpoint.  That  is, 
given  Source  1 and  Source  2,  the  question  is  whether  one  should 
split  the  channel  so  that  one  part  is  used  by  Source  1 and  the 
other  part  by  Source  2;  or  alternatively,  should  one  use  the  total 
capacity in common.  The criteria for decision are maximum
throughput and average delay.  We consider a channel split in which the
slots are partitioned between transmissions from Source 1 and
Source 2; however, all terminals receive on all time slots.  It
is  shown  that  the  choice  of  channel  configuration  depends  on  the 
number  of  terminals  in  Source  2,  n,  and  on  the  ratio  of  packet 
rate  of  Source  2 to  Source  1,  designated  by  a.  Further,  given  n, 
there  is  an  interval  of  a for  which  a higher  maximum  throughput 
can  be  obtained  by  splitting  the  channel. 

The  problem  considered  in  this  chapter  was  addressed  in  [7] 
for  n=l  but  for  several  random  access  schemes;  specifically,  for 
the nonslotted and slotted ALOHA and for the carrier sense [8]
random  access  schemes.  The  qualitative  conclusions  of  [7]  are 
the  same  as  in  this  chapter. 
II.  PRELIMINARY ANALYSIS

A.  The Slotted ALOHA Channel

In  the  slotted  ALOHA  random  access  mode,  a channel  is 
partitioned  into  segments  of  time  (slots)  equal  to  a packet 
transmission  time.  Terminals  transmit  their  packets  into  random 
slots  in  time.  That  is,  there  is  no  coordination  among  the 
terminals  as  far  as  the  choice  of  the  slot  is  concerned;  however, 
there  is  a universal  clock  which  enables  each  terminal  to  start 
the  transmission  of  its  packet  at  the  beginning  of  a slot.  If 
two  or  more  packets  are  transmitted  in  the  same  slot,  it  is  assumed 
that  none  of  the  packets  are  correctly  received  and  each  of  the 

terminals  will  retransmit  its  packet  at  some  randomly  chosen  future 
slot. 

One  can  see  that  the  number  of  packets  transmitted  (the 
channel  traffic)  is  larger  than  the  number  of  packets  offered  to 
the  system  due  to  the  retransmissions  of  packets  which  collide. 

Let  S denote  the  rate  of  packet  originations  per  slot  offered  to 
the  channel,  and  G the  rate  per  slot  of  packets  plus  retransmissions. 
Assume  that  the  two  origination  processes  are  Poisson.  Further 
assume  that  the  probability  that  a packet  is  blocked,  given  the 
packet  is  new,  equals  the  probability  that  a packet  is  blocked 
given  it  is  a retransmission  [3]. 

B.  The Single Source Case

If all the traffic is contributed from Source 1, Roberts
[5] has shown that the relation between S_1 and G_1 is given by:

    S_1 = G_1 e^{-G_1}    (1)

Abramson [2] considered the case where all the traffic is
contributed from Source 2 of n identical terminals and has shown that

    S_2 = G_2 (1 - G_2/n)^{n-1}    (2)

One can see that Eq. (2) takes the form of Eq. (1) when n → ∞.
The maximum values of S_1 and S_2 are given by:

    S_1^* = 1/e ;  S_2^* = (1 - 1/n)^{n-1}    (3)

The operation of the slotted ALOHA channel may become unstable [4].

In this paper we consider the steady state case, in which the offered
rate is also the throughput per slot or the utilization of the
channel.
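
Equations (1) to (3) can be tabulated with a few lines of code.  The
sketch below uses function names of our own choosing and simply
evaluates the two throughput relations and their maxima.

```python
import math


def s1(g1: float) -> float:
    """Eq. (1): throughput of the infinite-population (Poisson) source."""
    return g1 * math.exp(-g1)


def s2(g2: float, n: int) -> float:
    """Eq. (2): throughput of n identical terminals with total traffic G_2."""
    return g2 * (1.0 - g2 / n) ** (n - 1)


def s1_max() -> float:
    """Eq. (3): maximum of Eq. (1), attained at G_1 = 1."""
    return 1.0 / math.e


def s2_max(n: int) -> float:
    """Eq. (3): maximum of Eq. (2), attained at G_2 = 1."""
    return (1.0 - 1.0 / n) ** (n - 1)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 1000):
        print(n, round(s2_max(n), 4))
    print("Source 1:", round(s1_max(), 4))
```

As n grows, the printed maxima for Source 2 decrease from 1 towards
1/e, the maximum for Source 1.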


C.  Mixed  Sources  on  a Common  Channel 

The performance of a channel which is shared by terminals
from Sources 1 and 2 has been analyzed by Abramson [2], and
Kleinrock and Lam [3].  The offered packet rate to the channel is
S_1 + S_2, and the channel traffic G_1 + G_2.  The following equations
hold:

    S_1 = G_1 e^{-G_1} (1 - G_2/n)^{n}    (4)

    S_2 = G_2 (1 - G_2/n)^{n-1} e^{-G_1}    (5)




Network  Analysis  Corporation 


III.  MAXIMUM UTILIZATION OF THE SPLIT AND COMMON CHANNELS

Given  Source  1 and  Source  2,  the  question  is  whether  one 
should  split  the  channel  so  that  one  part  is  used  by  Source  1 
and  the  other  part  by  Source  2;  or  alternatively,  should  one  use 
the  total  capacity  in  common.  In  this  section  we  compare  the  channel 
configurations  in  terms  of  maximum  utilization.  To  obtain  an 
absolute  comparison,  we  introduce  the  parameter  of  the  ratio  of 
packet  rates  of  Source  2 to  Source  1: 


    a = S_2 / S_1    (6)


We  shall  use  the  subscripts  c and  s to  denote  a common 
channel  and  a split  channel,  respectively;  and  the  superscript  * 
to  denote  the  optimum  or  maximum  values.  The  total  given  capacity 
will be assumed as one unit and (C_1, C_2), C_1 + C_2 = 1, will denote a
channel split where the fraction C_1 is assigned to Source 1 and
the fraction C_2 to Source 2.  Given an arbitrary split (C_1, C_2),
the maximum utilization of the configuration is given by:

    S_s^* = C_1 (1 + a) S_1^* ;    a < (S_2^* C_2) / (S_1^* C_1)    (7a)

    S_s^* = C_2 (1 + 1/a) S_2^* ;  a > (S_2^* C_2) / (S_1^* C_1)    (7b)

corresponding to C_1 and C_2 being saturated, respectively.

If a is known one can split the channel optimally to obtain the
highest maximum utilization of the total capacity.  It can be
shown that the optimum split (C_1^*, C_2^*) satisfies the following
equation:

    C_2^* S_2^* = a C_1^* S_1^*    (8)

corresponding to both channels saturating simultaneously.

If the channel is optimally split, then the maximum
utilization of the total capacity is given by:

    S_s^* = C_1^* S_1^* + C_2^* S_2^* = (1 + a) (1 - 1/n)^{n-1} / [e (1 - 1/n)^{n-1} + a]    (9)

Note from Eq. (9) that when a → 0, S_s^* → 1/e, and when a → ∞,
S_s^* → (1 - 1/n)^{n-1}.
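
The optimum split follows from Eq. (8) together with C_1 + C_2 = 1.
The sketch below, whose function names are our own, computes the split,
evaluates Eqs. (7a) and (7b) as the smaller of the two saturation
limits, and can be used to confirm that no other split does better than
Eq. (9).

```python
import math


def source_maxima(n: int):
    """Eq. (3): (S_1*, S_2*) for the two sources taken separately."""
    return 1.0 / math.e, (1.0 - 1.0 / n) ** (n - 1)


def split_utilization(a: float, n: int, c1: float) -> float:
    """Eqs. (7a)/(7b): the channel that saturates first limits the total."""
    s1_star, s2_star = source_maxima(n)
    c2 = 1.0 - c1
    return min(c1 * (1.0 + a) * s1_star, c2 * (1.0 + 1.0 / a) * s2_star)


def optimal_split(a: float, n: int):
    """Eq. (8) with C_1 + C_2 = 1: both channels saturate together."""
    s1_star, s2_star = source_maxima(n)
    c1 = s2_star / (s2_star + a * s1_star)
    return c1, 1.0 - c1


def optimal_utilization(a: float, n: int) -> float:
    """Eq. (9): maximum utilization under the optimum split."""
    _, s2_star = source_maxima(n)
    return (1.0 + a) * s2_star / (math.e * s2_star + a)


if __name__ == "__main__":
    for a in (0.5, 2.5, 10.0):
        c1, c2 = optimal_split(a, 3)
        print(a, round(c1, 3), round(c2, 3), round(optimal_utilization(a, 3), 4))
```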


The total utilization of the common channel configuration is
given by the summation of Eqs. (4) and (5):

    S_c = [G_1 (1 - G_2/n) + G_2] (1 - G_2/n)^{n-1} e^{-G_1}    (10)

To obtain the maximum of S_c, one has to maximize Eq. (10) subject
to the constraint S_2/S_1 = a.  Alternatively, we can use the condition
on the channel traffic at the maximum utilization obtained by
Abramson [2].  Doing so, we obtain:

    S_c^* = [G_1^* (1 - G_2^*/n) + G_2^*] (1 - G_2^*/n)^{n-1} e^{-G_1^*}    (11)

where

    G_2^* = { n + a(n+1) - \sqrt{[n + a(n+1)]^2 - 4 a^2 n} } / (2a)    (12)

and

    G_1^* = 1 - G_2^*    (13)
Figure 1 shows the comparison between the maximum utilizations
of the two configurations, S_c^* and S_s^*.  It is shown as a function
of a with n as a parameter.  The values of n used are 1, 3, and ∞,
and the split is C_1 = C_2 = 1/2.

From Figure 1, one can see that for a given n, there is an
interval of a such that within this interval S_s^* ≥ S_c^*, and outside
of the interval S_s^* < S_c^*.  Furthermore, the interval discussed
decreases when n increases and constitutes a single point when
n = ∞.
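
The comparison can be reproduced numerically for finite n.  The sketch
below is illustrative only: it assumes that Abramson's condition takes
the form G_1 + G_2 = 1 at maximum utilization, uses the equal split
C_1 = C_2 = 1/2, and the function names and the sampled values of a are
our own choices.

```python
import math


def split_max(a: float, n: int, c1: float = 0.5) -> float:
    """Eqs. (7a)/(7b): maximum utilization of the split (c1, 1 - c1)."""
    s1_star = 1.0 / math.e
    s2_star = (1.0 - 1.0 / n) ** (n - 1)
    return min(c1 * (1.0 + a) * s1_star, (1.0 - c1) * (1.0 + 1.0 / a) * s2_star)


def common_max(a: float, n: int) -> float:
    """Maximum utilization of the common channel for a given ratio a."""
    b = n + a * (n + 1)
    g2 = (b - math.sqrt(b * b - 4.0 * a * a * n)) / (2.0 * a)
    g1 = 1.0 - g2  # assumed form of Abramson's condition, G_1 + G_2 = 1
    # Sum of the two source throughputs at (g1, g2).
    return (g1 * (1.0 - g2 / n) + g2) * (1.0 - g2 / n) ** (n - 1) * math.exp(-g1)


def split_wins(n: int, a_values):
    """Sampled values of a for which the split is at least as good."""
    return [a for a in a_values if split_max(a, n) >= common_max(a, n)]


if __name__ == "__main__":
    samples = [x / 10 for x in range(1, 201)]
    for n in (1, 3):
        wins = split_wins(n, samples)
        print(n, (min(wins), max(wins)) if wins else None)
```

The printed pair brackets the sampled values of a for which splitting
gives the higher maximum utilization.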
IV.  DELAY CONSIDERATIONS

It  is  clear  that  when  a is  within  the  interval  discussed  in 
the  previous  section,  the  split  configuration  is  better  than  the 
common  configuration  in  terms  of  maximum  throughput  and  average 
delay,  since  an  infinite  delay  is  obtained  in  the  common  configura- 
tion before  it  reaches  the  maximum  utilization  of  the  split  con- 
figuration. The  average  delay  of  a packet  from  Source  1 will  usually 
be  different  from  that  of  a packet  from  Source  2.  In  particular, 
in  the  split  configuration  when  the  maximum  utilization  is  obtained, 
one  channel  is  saturated  (infinite  delay)  and  the  other  is  not; 

except  for  the  optimum  splitting  a for  which  both  channels  saturate 
simultaneously. 

The  average  delay  in  this  system  is  composed  of  the  delay  when 
the  first  transmission  is  successful  plus  the  average  number  of  re- 
transmissions times  the  average  delay  per  retransmission.  A 
terminal  in  Source  2 also  encounters  queueing  delay. 

We  show  curves  of  delay  vs.  throughput  for  several  parameters 
a.  We  consider  a terrestrial  system  in  which  propagation  delay 
is  ignored,  and  use  all  the  assumptions  given  in  Section  1.  The 

average delay equations, in units of slot times, are the
following:

    D_1 = 1.5 + ((G_1 - S_1)/S_1) k    (14)

    D_2 = 1 + .5 (1 [?] 1/n) + (G_2/n) / [2 (1 - G_2/n)] + ((G_2 - S_2)/S_2) k    (15)

where k is the average waiting per retransmission.  The value .5
is added to represent that when a packet is ready for transmission,
the terminal will wait one half slot, on the average, until the
beginning of a slot.  The third term in Eq. (15) represents the
queueing delay at terminals of Source 2.  When writing the queueing
delay we assume that packets arrive according to a Poisson
distribution and require constant service time equal to the packet
transmission time (an M/D/1 queueing system).  Note that when n → ∞
then Eq. (15) takes the same form as Eq. (14).
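
As an illustration of Eq. (14), the sketch below computes the delay of
a Source 1 packet as a function of throughput for the case in which
Source 1 has a channel to itself, so that S_1 = G_1 e^{-G_1} as in
Eq. (1).  The function names and the bisection solver are our own; the
default k = 4 is the value used for the delay curves.

```python
import math


def channel_traffic(s: float) -> float:
    """Solve S = G e^{-G} for the root with 0 <= G <= 1 by bisection."""
    if not 0.0 < s <= math.exp(-1.0):
        raise ValueError("throughput must lie in (0, 1/e]")
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(-mid) < s:
            lo = mid
        else:
            hi = mid
    return hi


def delay_source1(s: float, k: float = 4.0) -> float:
    """Eq. (14): average delay, in slots, of a Source 1 packet."""
    g = channel_traffic(s)
    return 1.5 + (g - s) / s * k


if __name__ == "__main__":
    for s in (0.05, 0.1, 0.2, 0.3, 0.36):
        print(s, round(delay_source1(s), 3))
```

The result is expressed in slots of the channel on which Source 1
operates; on a split channel it would be multiplied by the slot time
of that channel.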

The  delay  equations  hold  for  the  split  as  well  as  the  common 
channel  configurations.  However,  if  one  assumes  the  same  packet 
length  in  each  case  and  the  slot  time  on  the  common  channel  is  one 
unit, then the slot time on the split channel will be 1/C_1 and
1/C_2 for the respective channels.  This has been taken into account
when  showing  the  delay  curves. 

Figures 2, 3, and 4 show the delay as a function of throughput
for a = .5, a = 2.5, a = 10., and for n = 1, 3, and ∞.  Other
parameters are C_1 = C_2 = 1/2 and k = 4.  From Figure 1, one can see
that a = .5 and a = 10. correspond to the case where S_c^* > S_s^*; on
the other hand a = 2.5 corresponds to the case where S_c^* < S_s^*.  The
throughput shown is the sum of both sources and the delay is an
average of D_1 and D_2 weighted by the throughputs of the sources.
One can see that when S_c^* > S_s^* the common channel configuration
results in lower average delay for all values of throughput (for
the same value n).  On the other hand, if S_s^* > S_c^* (Figure 3), there
are operating points, for example when the throughput is .55, where
the split channel configuration is better both in throughput and
delay.  However, even in this case, when the system operates at
low throughputs, the common channel configuration results in lower
values of delay.  This is due to the differences in the slot times.
FIGURE 3

DELAY VS. THROUGHPUT, a = 2.5

CONCLUSIONS


We  relate  the  results  obtained  to  the  Packet  Radio  System. 

If  one  assumes  that  a packet  which  originates  from  a terminal  will 
always  be  directed  towards  a station  then  a has  the  practical 
meaning  of  the  average  number  of  response  packets  from  station  to 
terminal  for  every  packet  which  is  successfully  transmitted  from 
a terminal  to  a station. 

It is demonstrated that if this ratio a is known, then one
can split the channel into two to obtain a higher maximum utilization
of the total channel capacity. It is also shown how to
split the channel. However, if the channel is split and the system
operates at low values of throughput, then the average delay is
higher than would result from operating in a common channel
configuration. Another conclusion is that if a is not known, or if it
varies, then the common channel configuration will be better in
terms of throughput and delay.

Finally,  we  note  that  the  system  analyzed  models  the  Packet 
Radio  System  when  there  is  no  coordination  among  the  stations  in 
choosing  the  slot,  so  that  packets  from  two  stations  can  collide 
on  the  same  slot.  In  practice,  however,  one  may  consider  having 
signalling channels among the stations to obtain the above coordination
dynamically. In this case, the set of stations has to be
considered as a single source; that is, n = 1.

AREA COVERAGE BY LINE-OF-SIGHT RADIO


I. PROBLEM FORMULATION


We are concerned with line-of-sight coverage of an area where
mobile terminals or fixed terminals are transmitting by radio from
unspecified locations. The problem is to locate repeaters so that
any such terminal will be in line-of-sight of repeaters and that
there be reliable connections between every pair of terminals
(and repeaters). More precisely, we wish to minimize the
installation cost and maintenance cost of the repeaters subject to
a constraint on the reliability of service.

In general, determining if line-of-sight microwave transmission
between two points is possible involves taking into account many
factors including wavelength (Fresnel zones), weather conditions
(effective earth radius), antenna design, height, topography, etc.
Nevertheless, we shall assume that there are known functions $\ell(r,t)$
and $L(r_1,r_2)$ that are 1 if a repeater at location r can communicate
with a terminal at location t and if a repeater at location $r_1$
can communicate with a repeater at location $r_2$ respectively, and
are 0 otherwise.

From  purely  topographical  considerations  it  is  obvious  that 
the  "flat  terrain"  problem  and  the  "hilly  terrain"  problem  should 
be handled separately. For "flat terrain" the problem is homogeneous,
i.e.  installation  costs,  maintenance  costs  can  be  assumed  equal  at 
all  locations  and  the  transmission  properties  are  identical  at  all 
points.  The  "flat  terrain"  problem  is  discussed  in  Section  3; 
a model  is  suggested  for  which  we  compute  an  optimal  solution. 

The primary concern in the ensuing paragraphs is with the
"hilly terrain" problem. By that one should understand hilly
topographies as well as flat topographies that cannot be viewed as
homogeneous from a cost or transmission viewpoint.

In "hilly terrain" it is impractical to consider all possible
locations of repeaters and terminals, which theoretically are
infinite in number. We shall limit ourselves to a finite set R
of possible repeater locations and a finite set T of possible
terminal locations. How the sets R and T are chosen will be of
great computational importance, and they will probably be chosen
adaptively. But for the time being, we assume R and T known and fixed.

The principal and immediate interest is in an appropriate
mathematical model of the situation and some indications on how
to solve the problem. The first problem is the proper choice of
reliability measure or grade of service. We assume that the radio
network is for local distribution-collection of terminal traffic
with rates small compared to the channel capacity so that throughput
capacity is not a constraint. That is, if any path through the
network exists for a given pair of terminals we assume there is
sufficient capacity for traffic between them. Possible measures
of network reliability that have proved useful in the analysis
of communication networks [13] are the probability that all terminal
pairs can communicate and the average fraction of terminal pairs
which can communicate. However, for network synthesis as distinguished
from analysis these approaches appear too difficult both
from computational and data collection points of view. This suggests
the "deterministic" requirement that there exist k node disjoint
paths between every terminal pair. This guarantees that at
least k repeaters or line of sight links must fail before any
terminal pair is disconnected. Let the cost of a repeater at
location $r \in R$ be c(r) and $c(R^0) = \sum \{c(r) \mid r \in R^0\}$ where
$R^0 \subseteq R$. Then we can formulate:

Problem I: Find $R^* \subseteq R$ minimizing $c(R^*)$ subject to the
constraint that for all $t \in T$ and $r \in R^*$ there exist k node disjoint
paths from t to r in the network $N(t;R^*)$ where

$$N(t;R^*) = [\,R^* \cup \{t\},\ \{(r_1,r_2) \mid L(r_1,r_2)=1\} \cup \{(r,t) \mid \ell(r,t)=1\}\,]$$

One might demand only that there be k node disjoint paths
between every pair of terminals instead of between each
terminal-repeater pair, but we are assuming that communication always
takes place through a "station" which could be any of the repeaters.
The analysis of the terminal to terminal model is similar in any
case.

The following two propositions motivate a new Problem II.

Proposition 1: $|\{r \mid \ell(r,t) = 1\} \cap R^*| \ge k$.

Proposition 2: For all $r \in R^*$, $|\{r_1 \mid L(r,r_1) = 1\}| \ge k-1$.
If for each $r \in R$ there exists $t \in T$ such that $\ell(r,t) = 0$ then
$|\{r_1 \mid L(r,r_1) = 1\}| \ge k$.

Problem II: Choose $R^* \subseteq R$ to minimize $c(R^*)$ subject to:

1. For all $t \in T$, $|\{r \mid \ell(r,t) = 1\} \cap R^*| \ge k$
(the k-fold set covering problem).

2. For all $r_1, r_2 \in R^*$, $r_1 \ne r_2$, there exist k node disjoint
paths connecting $r_1$ to $r_2$
(the minimum cost redundant network problem).
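
The two constraints of Problem II can be checked mechanically for a candidate set of repeaters. The sketch below is one possible way to do so and is not taken from the report: it counts node disjoint paths by a unit-capacity maximum flow on a node-split graph (a standard consequence of Menger's theorem). The function names and the data layout (`sees`, `links`) are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def disjoint_paths(adj, s, t):
    """Count internally node-disjoint s-t paths via unit-capacity max flow.

    Every node v becomes an arc (v, "in") -> (v, "out") of capacity 1
    (s and t get a large capacity), so no intermediate node is reused.
    """
    cap = {}

    def add_arc(u, v, c):
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {}).setdefault(u, 0)  # residual arc

    for v, neighbours in adj.items():
        add_arc((v, "in"), (v, "out"), len(adj) if v in (s, t) else 1)
        for w in neighbours:
            add_arc((v, "out"), (w, "in"), 1)

    source, sink, flow = (s, "in"), (t, "out"), 0
    while True:
        parent = {source: None}          # breadth-first search for an augmenting path
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        v = sink
        while parent[v] is not None:     # push one unit along the path found
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1


def feasible_for_problem_2(chosen, sees, links, terminals, k):
    """Check constraints 1 and 2 of Problem II for a candidate set R*.

    sees[r] -- set of terminals t with l(r, t) = 1
    links   -- set of frozenset({r1, r2}) with L(r1, r2) = 1
    """
    chosen = sorted(chosen)
    if any(sum(t in sees[r] for r in chosen) < k for t in terminals):
        return False                     # constraint 1: k-fold cover
    adj = {r: [q for q in chosen if frozenset((r, q)) in links] for r in chosen}
    return all(disjoint_paths(adj, a, b) >= k
               for a, b in combinations(chosen, 2))   # constraint 2


# Four repeaters on a ring, two terminals each seen by two repeaters.
ring = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]}
sees = {"a": {"t1"}, "b": {"t1"}, "c": {"t2"}, "d": {"t2"}}
print(feasible_for_problem_2("abcd", sees, ring, ["t1", "t2"], k=2))
```

Removing one link of the ring leaves a simple path, on which no pair of repeaters has two node disjoint paths, so the same candidate set then fails constraint 2.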

The virtue of II as compared to I is that II is an amalgam
of two well studied network problems, the set covering problem
[7] and a problem closely related to the minimum cost redundant
network problem [11]. We can then attack the problem using
previously developed techniques.

Problems I and II are related by:

Proposition 3: Any $R^*$ satisfying the constraints of II
also satisfies the constraints of I.

Proof: Suppose $R^*$ satisfies the constraints of Problem II and
violates those of Problem I. Then there exist a terminal $t_0$ and
a repeater $r_0$ such that there are not k node disjoint paths connecting
them in $N(t_0;R^*)$. According to the Menger Graph Theorem [8],
there are k-1 repeaters $r_1,\ldots,r_{k-1}$ not including $r_0$ such that when
these repeaters are removed from $R^*$ there is no path from $t_0$ to $r_0$.
But this leads to a contradiction. Removing k-1 repeaters cannot
disconnect $t_0$ from the net by 1 of Problem II. Thus, there is an
arc $(t_0,r')$ to a remaining repeater $r'$ with $\ell(r',t_0) = 1$.
Similarly, k-1 repeaters cannot disconnect $r'$ from $r_0$ by 2. Thus,
there is a path from $t_0$ to $r_0$.

The  converse  is  not  necessarily  true;  that  is,  there  may  be 
feasible  solutions  to  I which  are  not  feasible  to  II.  Figure  1 
depicts  a counter-example  to  a possible  converse  for  k=2. 


However, by Proposition 3, solving II is fail-safe in the sense
that the solution obtained will always be feasible for Problem I.

Feasible solutions to I which are not feasible in II will be
quite rare. In order for this to happen, there would have to be
two repeaters $r_1$ and $r_2$ in $R^*$ (where $R^*$ is an assumed solution to
I) for which there are not k node disjoint paths joining them yet
for which all terminals in the vicinity of $r_1$ can communicate by k
node disjoint paths with $r_2$ and conversely. This does not happen
unless there are very few terminals and many repeaters near $r_1$ and
$r_2$, which from the physical nature of the problem is highly unlikely.
On the other hand, as pointed out above, artificial counter-examples
to a possible converse of Proposition 3 can easily be constructed.

II. COMPUTATIONAL TECHNIQUES

The  following  problem  will  be  referred  to  as  a k-cover  problem. 

Let N be a bipartite graph with edges E and nodes (R, T). The edges
E connect nodes of R to nodes of T. Then the following is a k-cover
problem:

Find $R^* \subseteq R$ such that the valence of each T-node in N* is at least k
and such that the cardinality of R* (the number of elements of R*)
is minimum, where N* is the subgraph obtained by deleting all nodes
$R \setminus R^*$ (where $\setminus$ indicates set difference) and all edges connected to
these nodes. This is also known as the $\alpha$-width problem [4], [10].
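
For very small instances the k-cover problem can be solved by plain enumeration, which is useful as a reference when testing other methods. The sketch below is not part of the report; the function name and the representation of the graph (one set of adjacent repeaters per terminal) are assumptions made for illustration.

```python
from itertools import combinations


def min_k_cover(adjacent, num_repeaters, k):
    """Return a smallest set R* giving every T-node valence at least k.

    adjacent[i] is the set of repeater indices adjacent to terminal i.
    Exhaustive search over subsets of increasing size: tiny instances only.
    Returns None when no subset works (the problem is infeasible).
    """
    for size in range(num_repeaters + 1):
        for subset in combinations(range(num_repeaters), size):
            chosen = set(subset)
            if all(len(adj & chosen) >= k for adj in adjacent):
                return chosen
    return None


# Three terminals, three repeaters, each terminal adjacent to two repeaters.
adjacent = [{0, 1}, {1, 2}, {0, 2}]
print(min_k_cover(adjacent, 3, k=1))
print(min_k_cover(adjacent, 3, k=2))
print(min_k_cover(adjacent, 3, k=3))
```

The search examines up to $2^{|R|}$ subsets, which is why it is only a checking tool and not a method for problems of practical size.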

The 1-cover problem is the classical set covering problem.
Extensive research has been and is being done on the 1-cover problem,
see e.g., [7] and references mentioned therein, see also [1]. (A
special case of the 1-cover problem is the simple covering problem
where each R-node is connected to exactly 2 T-nodes. For this
problem algorithms are known that are "efficient", i.e., with known
polynomial bounds on the number of operations [7].) So far, practice
does not seem to have singled out a "best" algorithm to solve the
general 1-cover problem. But in any case, those available seem to
be much more efficient than solving the problem as a straight integer
programming problem.

a. Every k-cover problem is equivalent to a 1-cover problem.
To establish the assertion let us consider the case
when k=2. The generalization to arbitrary k is
important but it is easier to see the proof for
the case k=2.

Let us consider a bipartite graph constructed
in the following manner: Start with a 2-covering
problem. Let L be the number of elements in R.
Take L copies of the T-nodes. Connect node $r_i$ to
all copies of node $t_j$ (if $(r_i, t_j) \in E$ of the 2-cover
problem), except for the copy of $t_j$ in the $i$-th copy
of T. (See Figure 2.)


FIGURE  2 


We  call  this  problem  the  1-cover  problem  generated 
by  the  2-cover  problem. 
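
The construction can be tried out on small cases. In the sketch below (not from the report; all names are illustrative assumptions) the copy of $t_j$ in the $c$-th copy of T keeps every neighbour of $t_j$ except repeater $c$, and an exhaustive search compares the minimum 2-cover of the original problem with the minimum 1-cover of the generated one.

```python
from itertools import combinations


def min_cover_size(adjacent, num_repeaters, k):
    """Exhaustive minimum k-cover size (None if infeasible); tiny cases only."""
    for size in range(num_repeaters + 1):
        for subset in combinations(range(num_repeaters), size):
            if all(len(adj & set(subset)) >= k for adj in adjacent):
                return size
    return None


def generated_one_cover(adjacent, num_repeaters):
    """Build the 1-cover problem generated by a 2-cover problem.

    adjacent[j] is the set of repeaters adjacent to terminal t_j.  The copy
    of t_j in the c-th copy of T keeps every neighbour except repeater c.
    """
    return [adj - {c} for c in range(num_repeaters) for adj in adjacent]


adjacent = [{0, 1, 2}, {1, 2, 3}]             # two terminals, four repeaters
generated = generated_one_cover(adjacent, 4)  # four copies of T: eight terminals
print(min_cover_size(adjacent, 4, k=2), min_cover_size(generated, 4, k=1))
```

A set of repeaters covers every copy of $t_j$ exactly when it contains at least two neighbours of $t_j$, since removing any single repeater must still leave one.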


Proposition: Every solution to the 2-cover problem yields a
solution to the generated 1-cover problem and conversely.

In particular, the optimal solution of the 2-cover problem
yields an optimal solution to the generated 1-cover problem and
conversely.

Proof: The first part (a 2-cover solution yields a solution
to the generated 1-cover) is trivial. In the other direction,
suppose the $n$-th copy of $t_j$ is "covered" by node $r_i$ in the optimal
1-cover of the generated 1-cover problem. In the $i$-th copy, however,
$t_j$ must be "covered" by another node in R. So every node is "covered"
by at least 2 elements of R. The optimality statement follows
readily.

For k-covers (k>2) a similar proposition can be found but then
the number of copies of T needed (L in this case) goes up exponentially.
(We suspect that this type of transformation is similar
to reducing general integer variables in integer programs to 0-1
variables.)


Important remarks: If the problem is formulated as a k-cover
problem, one should realize that by the above results one might
suspect that the k-cover problem is much more complicated than the
1-cover problem. This is corroborated by the experience with integer
programming algorithms. The only class of problems which
can be solved with any form of computational success are those
involving only 0's and 1's.

b. The Covering Problem as an Integer Program. Let A be a
(T, R) matrix (where the rows correspond to terminals and the
columns to repeaters). The size of A is $|T| \times |R|$. We have that
$a_{ij}$ = 1 or 0 depending on whether terminal i is "visible" or
not to repeater j. In terms of the bipartite graph N, the
entry $a_{ij}$ = 1 if there is an edge between nodes $r_j$ and $t_i$,
the entry $a_{ij}$ = 0 otherwise. We can then formulate the k-cover
problem as follows:

(I.P.)
$$\min \sum_j x_j \quad \text{such that} \quad \sum_j a_{ij} x_j \ge k, \quad i = 1,\ldots,|T|, \quad \text{with } x_j = 0 \text{ or } 1.$$

This is an integer program (sometimes called a program in
boolean variables). A number of algorithms to solve integer
programs are known. See [7] for a survey. These algorithms offer
little hope in solving the location problem arising in the Packet
Radio project in that |R| and |T|, i.e., the number of variables
and the number of constraints respectively, are much too large
for existing methods for k-cover problems.

c. Approximating the k-cover Problem by Linear Programming:
The integer programming problem formulated above can be
replaced by a linear program

(L.P.)
$$\min \sum_j x_j \quad \text{such that} \quad \sum_j a_{ij} x_j \ge k, \quad i = 1,\ldots,|T|, \quad 0 \le x_j \le 1,$$

where the constraint $x_j$ = 0 or 1 has been replaced by the
constraint $0 \le x_j \le 1$. The difference is obvious: the solution to
(L.P.) will contain fractional values, but we note that if a
variable is "profitable", it will usually be made as large as
possible. The constraints $\sum_j a_{ij} x_j \ge k$ also generate upper
bounds on the $x_j$'s. Thus one might expect many 0's and 1's in
the optimal solution of the L.P. problem.

An upper bound for the optimal solution to the (I.P.) integer
program can be obtained by "pushing" all the fractional $x_j$'s
appearing in the optimal solution to 1. Obviously, that yields
an upper bound on Min $\sum_j x_j$. The optimal solution of (L.P.) yields
a lower bound.
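
The two bounds can be read off directly once an optimal solution of (L.P.) is available. The sketch below assumes that solution has been computed elsewhere (no LP solver is included), and the function name is illustrative. Rounding the L.P. value up to the next integer is an addition made here: it is valid because the optimum of (I.P.) is an integer.

```python
from fractions import Fraction
from math import ceil


def bounds_from_lp(A, k, x_lp):
    """Bounds on the optimum of (I.P.) from a given optimal solution of (L.P.)."""
    for row in A:
        if sum(a * x for a, x in zip(row, x_lp)) < k:
            raise ValueError("x_lp violates a cover constraint")
    lower = ceil(sum(x_lp))                      # L.P. value, rounded up
    x_up = [1 if x > 0 else 0 for x in x_lp]     # push fractional values to 1
    return lower, sum(x_up), x_up


# 1-cover instance whose L.P. optimum is x = (1/2, 1/2, 1/2): adding the three
# constraints gives 2 * (x_0 + x_1 + x_2) >= 3.
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
half = Fraction(1, 2)
lower, upper, x_up = bounds_from_lp(A, 1, [half, half, half])
print(lower, upper, x_up)
```

On this instance the integer optimum is 2 (any two repeaters), so the lower bound is tight while the rounded solution uses one repeater more than necessary.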

There are many ways to improve the above solution (without
actually going all the way to integer programming). Let us
suggest two.

Scheme 1. Let $x^{opt}$ be the optimal solution of the (L.P.)
problem. Set $x_j^* = 1$ if $x_j^{opt} = 1$ and $x_j^* = 0$ otherwise.
Let $k_i^* = \max\{0,\ k - \sum_j a_{ij} x_j^*\}$. Construct A* as follows: Remove
from A all columns j such that $x_j^* = 1$ and remove all rows i with
$k_i^* = 0$. Set $k^* = \{k_i^* \mid k_i^* \ne 0\}$. You have now a reduced problem.
It is again an integer program.

(I.P.S.1)
$$\min \sum_j x_j \quad \text{s.t.} \quad A^* x \ge k^*, \quad x_j = 0 \text{ or } 1.$$

The problem is substantially reduced in size. We can now
use an integer programming algorithm.
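
The reduction of Scheme 1 amounts to a few list operations. The sketch below is illustrative only: the function name, the tolerance used to decide that an L.P. value equals 1, and the returned index lists are assumptions, and the L.P. solution is taken as given.

```python
def scheme1_reduce(A, k, x_opt, tol=1e-9):
    """Fix x_j = 1 where the L.P. solution equals 1, then shrink the problem.

    Returns (A_star, k_star, fixed, free): the reduced matrix, the reduced
    requirement vector, and the original indices of fixed and free columns.
    """
    n = len(A[0])
    fixed = [j for j in range(n) if abs(x_opt[j] - 1.0) <= tol]       # x*_j = 1
    free = [j for j in range(n) if j not in fixed]
    residual = [max(0, k - sum(row[j] for j in fixed)) for row in A]  # k*_i
    keep = [i for i, need in enumerate(residual) if need > 0]         # rows with k*_i != 0
    A_star = [[A[i][j] for j in free] for i in keep]
    k_star = [residual[i] for i in keep]
    return A_star, k_star, fixed, free


# Terminal 0 is seen only by repeater 0, so x_0 = 1 in the L.P. optimum; the
# other three terminals form the fractional "triangle" (1/2, 1/2, 1/2).
A = [[1, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1],
     [0, 1, 0, 1]]
A_star, k_star, fixed, free = scheme1_reduce(A, 1, [1.0, 0.5, 0.5, 0.5])
print(fixed, free)
print(A_star, k_star)
```

The reduced problem keeps only the columns still undecided and the rows still short of their requirement, and is then passed to an integer programming code.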


Scheme 2. Once the solution to the L.P. is obtained, a
branch-and-bound algorithm can be developed, to be continued until
a sufficiently small difference exists between the best solution
and the smallest upper bound obtained so far. Note that by the
remarks preceding Scheme 1 it is always very easy to obtain
upper and lower bounds.

There is also the possibility to use "integer cuts", see
[7]. This is integer programming. However, there is a possibility
that for k-cover problems, good cuts can actually be constructed
and recognized. That is one direction of research (that might
become imperative) but which has not been pursued. Success in
characterizing cuts has been achieved in the past for highly structured
problems, see [2] and [5].


d. Network Flow Problem with Concave Cost: For the
sake of completeness we record one other formulation. The
problem is to find a feasible flow (in the network to be
described below) at minimum cost. One possible advantage is
the potential use of algorithms specifically developed for
fixed charge problems or branch-and-bound methods on nets, see e.g.,
[9] and [14]. We consider the bipartite graph N as described in
the beginning of this section (2), Figure 3, to which we add a source
s and a sink S. Let f(x,y) be the flow between nodes x and y; we
have the following constraints:

$f(x,y) \ge 0$ for all x,y
$f(s,y) \le c(s,y)$ for all $y \in R$
$f(x,S) \ge p(x,S)$ for all $x \in T$
and f(x,y) integer.

The upper bound c(s,y) is set equal to the valence of $y \in R$
minus 1 and for all $x \in T$, p(x,S) = k. The cost a[f(x,y)] = 0 for
all arcs except for arcs (s,y), $y \in R$, where

a[f(s,y)] = 0 if f(s,y) = 0
a[f(s,y)] = 1 otherwise.


FIGURE 3

BIPARTITE GRAPH WITH SOURCE AND SINK

e. A Heuristic Approach to the k-cover Problem. Given the
limited success of integer programming algorithms in solving
large scale problems, we have been led to consider heuristic
methods to find good solutions to the k-cover (of terminals
by repeaters) problem which is typically large scale. It is
intuitively appealing to consider a terminal as particularly
critical if it is adjacent to few repeaters. (In the extreme
cases, if a terminal has fewer than k adjacent repeaters, the
problem is infeasible and if it has exactly k adjacent repeaters,
all of them must be chosen for any feasible solution.) Similarly,
a repeater is desirable if it is adjacent to a large number of
terminals, especially if the terminals are highly critical.

The heuristic algorithm described below systematizes these
intuitive notions in the search for a "good" solution.

Again, consider the problem in matrix form, but this time
we imbed the problem in a more general class where we can require
a different cover multiple for each terminal. The more general
problem is:

(I.P.H.)
$$\min \sum_j x_j \quad \text{such that} \quad \sum_j a_{ij} x_j \ge k_i, \quad i = 1,\ldots,|T|, \quad \text{with } x_j = 0 \text{ or } 1,$$

where $k_i$ represents the cover multiple required for terminal i.
If $k_i \le 0$, then no repeater is needed to cover terminal i.

One iteration of the heuristic method consists in passing
from a problem of the type (I.P.H.) to an "equivalent" problem
by fixing one of the variables, say $x_s$, at its upper bound 1,
and putting the index s in an index set J. This implies the
selection of the corresponding repeater. The new problem,
denoted by (I.P.H.)', is obtained from (I.P.H.) by adjusting
the matrix $A = [a_{ij}]$ and the cover requirement vector $k = [k_i]$
as follows: the column $A^s = [a_{is}]$ is deleted from A and the
adjusted vector $\bar{k}$ is given by $\bar{k}_i = k_i - a_{is}$. The variable $x_s$
no longer appears in (I.P.H.)'. The algorithm terminates
if at any iteration it is recognized that for the adjusted A
and k, for some i, $\sum_j a_{ij} < k_i$, the problem is then infeasible;
or as soon as all the adjusted $k_i$, $i = 1,\ldots,|T|$, become
nonpositive. In the latter case, the problem is feasible, the
adjusted problem is optimized by setting all remaining $x_j = 0$
(for $j \notin J$). A feasible solution x to the original is obtained
by setting $x_j = 0$ if $j \notin J$ and $x_j = 1$ if $j \in J$. The vector x
is called the heuristic solution.

The equivalence of the new problem (I.P.H.)' and the earlier
version (I.P.H.) depends naturally on the choice of the variable
$x_s$. If $x_s$ is 1 in an optimal solution to (I.P.H.), then the two
problems are equivalent in the sense that:

Min of (I.P.H.) = [Min of (I.P.H.)'] + 1

Equivalence of the set of optimal solutions is guaranteed
only if $x_s = 1$ in every optimal solution of (I.P.H.) (we must
ignore $x_s$ when considering optimal solutions to (I.P.H.)).

The selection criterion to choose a variable $x_s$ at each
iteration (or equivalently the index s) to be fixed at
value 1 can be viewed as a function, called $\sigma$, of the adjusted
matrix A and vector k with values in the index set {j}, i.e.,

$$\sigma : (A,k) \mapsto s, \quad s \in \{j\}.$$

There obviously exists a function $\sigma$, a selection criterion,
which will guarantee the equivalence of (I.P.H.) and the new problem (I.P.H.)'
at each iteration and consequently, the optimality of the resulting
heuristic solution. However, applying this
function $\sigma$ to [A,k] might involve no less than solving (I.P.H.).
We use a heuristic motivated by the considerations mentioned
at the beginning of this section. The adjusted A and k are
used to compute the "probability" that a given $x_j$ belongs to
the optimal solution; by this we mean that to each column of
the adjusted matrix A we associate a nonnegative number $\omega_j$
such that

$$p_j = \omega_j \left( \sum_{l \notin J} \omega_l \right)^{-1}$$

represents, very loosely speaking, the probability that $x_j$ belongs
to the optimal solution. We select $x_s$, $s \notin J$, if $p_s \ge p_j$
for $j \notin J$ or equivalently if $\omega_s \ge \omega_j$ for $j \notin J$. The selection
criterion $\sigma$ will be completely determined if we specify a
method to compute the $\omega_j$.

In the selection of these weights $\omega_j$, we must take into
consideration the effort involved in the computation as well
as the reliability of the resultant selection. We have used
four such weights:


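We first define:

$$k_i^* = \tfrac{1}{2}\,[\,|k_i| + k_i\,]$$

Observe that $k_i^* = 0$ if $k_i \le 0$ and $k_i^* = k_i$ otherwise.

We set:

$$\omega_j^{(1)}(A,k) = \sum_i \left( \frac{k_i^*}{\sum_{l \notin J} a_{il} - k_i^*} \right) a_{ij}, \quad j \notin J$$

$$\omega_j^{(2)}(A,k) = \sum_i \left( \frac{k_i^*}{\sum_{l \notin J} a_{il}} \right) a_{ij}, \quad j \notin J$$

$$\omega_j^{(3)}(A,k) = \sum_i \left( \frac{[\ldots]}{\sum_{l \notin J} a_{il} - k_i} \right) a_{ij}, \quad j \notin J$$

$$\omega_j^{(4)}(A,k) = \sum_i \left( \frac{[\ldots]}{\sum_{l \notin J} a_{il}} \right) a_{ij}, \quad j \notin J$$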

The entries between parentheses ( ) in the definition of the weights
$\omega_j^{(\ell)}$ measure how critical terminal i is. If $k_i \le 0$ (the terminal
does not need covering) then $k_i^*/(\sum_{l \notin J} a_{il} - k_i^*) = 0$ (for our purposes
0/0 = 0). On the other hand, if $\sum_{l \notin J} a_{il} = k_i$, namely there are
exactly enough repeaters to cover the terminal, then the weight is
infinite. A repeater will be preferred to another one if it covers
more critical terminals.
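
The iteration described above can be sketched in a few lines. This is an illustrative reading of the method, not the authors' code: `weight=1` uses the criticality ratio $k_i^*/(\sum a - k_i^*)$ discussed in the preceding paragraph, `weight=2` uses the ratio $k_i^*/\sum a$ as far as it can be read from the source, ties are broken by smallest index, and all names are assumptions.

```python
import math


def heuristic_cover(A, k, weight=1):
    """Greedy k-cover heuristic.

    A[i][j] = 1 if repeater j covers terminal i; k[i] is the cover multiple
    of terminal i.  Returns the list J of selected repeaters, or None if the
    problem is recognized as infeasible.
    """
    m, n = len(A), len(A[0])
    need = list(k)                 # adjusted cover requirements
    J = []                         # indices fixed at 1
    free = set(range(n))           # columns still in the adjusted matrix
    while any(r > 0 for r in need):
        avail = [sum(A[i][j] for j in free) for i in range(m)]
        if any(avail[i] < need[i] for i in range(m)):
            return None            # some terminal can no longer be covered

        def criticality(i):
            k_star = max(need[i], 0)            # k*_i = (|k_i| + k_i) / 2
            if k_star == 0:
                return 0.0                      # convention 0/0 = 0
            denom = avail[i] - k_star if weight == 1 else avail[i]
            return math.inf if denom == 0 else k_star / denom

        crit = [criticality(i) for i in range(m)]

        def omega(j):
            return sum(crit[i] for i in range(m) if A[i][j])

        s = max(sorted(free), key=omega)        # ties: smallest index wins
        J.append(s)
        free.remove(s)
        for i in range(m):
            need[i] -= A[i][s]                  # adjusted k_i = k_i - a_is
    return J


A = [[1, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1],
     [0, 1, 0, 1]]
print(heuristic_cover(A, [1, 1, 1, 1]))
```

In this small case terminal 0 has exactly one adjacent repeater, so its criticality is infinite and repeater 0 is selected first.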

Computational Experience: The size of test problems solved
varies from problems with as few as 5 repeaters and 5 terminals to
problems with as many as 400 repeaters and 400 terminals. Roughly
speaking, the computation time was directly proportional to the
size of the adjacency matrix A and the cover multiple required. The
computer used was a PDP-10 (time sharing). The larger problems
(400 repeaters, 400 terminals, 2-cover) were solved in 70 sec. or
less. The time, as may be expected, is dependent on the density
of 1's in the incidence matrix A. Thus, the maximum time recorded
arose from a terminal-repeater configuration where each repeater
covers many terminals. The running time is of the order of $|T| \times |R|^2$
where |T| and |R| are the number of terminals and repeaters respectively.

We ran a number of problems with the heuristic code and, for
comparison, with the Ophelie mixed integer programming code running
on a CDC 6600 computer. The Ophelie code uses the branch-and-bound
method. In the case of very simple problems (8 repeaters, 9 terminals,
2-cover) there was essentially no difference in running
time (presumably most of the time, less than .5 sec, was spent in
setting up the problem). Running experience with the Steiner triples'
problem described in the next section yields a ratio of 500 to 1
between the Ophelie time and the heuristic code time when solving
the smaller problem $A_{27}$ (117 terminals, 30 repeaters, 1-cover),
and no comparison is available for the larger problem $A_{45}$ (330 terminals,
45 repeaters) since for example the MPSX code failed to
reach a solution in more than one half hour on an IBM 360-91. *

Comparison in running time is naturally not completely valid,
since most of the computation time in the Ophelie code can be spent
just checking if a given solution is optimal. The heuristic method
does not try to check the optimality of its solution. However, in
general, results with the heuristic code have been extremely good.
When the heuristic solution deviated from the optimal solution, the
problem usually involved numerous ties for the maximum $\omega_j^{(\ell)}$, $\ell = 1,2,3,4$,
such as in the Steiner triples' problems. In all problems that were
generated to resemble the packet radio terminal-repeater problem,
the heuristic algorithm reached the optimal solution (in those problems
for which we are able to determine the optimal solution).

The running time was unaffected by the choice of any of the
selection criteria but for "hard" problems, we obtained consistently
better solutions when using $\omega_j^{(1)}$ and $\omega_j^{(2)}$ rather than $\omega_j^{(3)}$ and $\omega_j^{(4)}$.


The Steiner Triples' Problem: Fulkerson, Nemhauser and Trotter
[6] report on two covering problems which they characterize as
computationally difficult. In each problem, the matrix A is the
incidence matrix of a Steiner triple system. The first problem, labelled
$A_{27}$, is a 1-cover problem with 117 terminals and 30 repeaters.
The second problem, labelled $A_{45}$, has 330 terminals and 45 repeaters
and is also a 1-cover problem. Data for both problems can be
found on pages 9 and 10 of [6]. The problems are considered to be
difficult because of the large number of verifications (branching in
branch-and-bound, cuts in cutting methods) required to establish
that a given solution is in fact optimal.

* Private Communication, R. Fulkerson, June, 1974

In  our  runs,  the  variable  to  be  fixed  at  1 (the  selected  repeater) 
at  each  iteration  was  selected  by  using  the  criterion  resulting  from 
using weights $\omega_j$. In the case of ties for the maximum weight $\omega_j$,
the  variable  with  smallest  index  was  chosen.  Due  to  the  inherent 
symmetries  present  in  these  problems,  numerous  ties  did  occur.  For 
example,  all  weights  are  equal  in  the  first  iteration.  Thus,  the 
tie  breaking  rule  plays  a relatively  important  role  in  the  selection 
of  a solution.  We  solved  both  problems  100  times  breaking  ties  by 
random  selection  among  all  tied  variables. 

The frequency of the values generated by the heuristic solutions
is recorded in the table below. In all cases the heuristic obtained
the optimum solution for the smaller problem $A_{27}$.

Heuristic Minima     30    31    32    34
$A_{45}$              3    44    29    24

The total running time for 100 solutions for $A_{27}$ (including the generation
of random numbers to break ties) required less than 1/5 of the time
required to solve by branch-and-bound (even giving the optimal
solution as a starting solution) as recorded in [6]. Approximately
5 sec. were necessary to obtain a heuristic solution to the larger
problem $A_{45}$. (The branch-and-bound algorithm failed to produce
a solution to $A_{45}$.) H. Ryser has conjectured that the optimal
solution to $A_{45}$ has 30 repeaters [10].


How Accurate is the Heuristic Method: Unfortunately, we cannot
obtain significant bounds on the error for the heuristic method
using any of the weights $\omega_j^{(\ell)}$, $\ell = 1,\ldots,4$ (that determine the selection
criterion). We give here an example, developed in collaboration with
Professor Robert Bixby, which shows that the error can be arbitrarily
large. The example is a 1-cover problem. Let $T_0,\ldots,T_n$ be disjoint
sets of indices with cardinality $|T_i| = 2^i$. Set $T = \bigcup_i T_i$.
Terminals are all pairs of indices (t,1) and (t,2) with
$t \in T$. There are n + 3 repeaters. Repeaters $R_i$, $i = 0,1,\ldots,n$, are
connected to all terminals with indices (t,1) and (t,2) with $t \in T_i$.
Repeaters $R_{n+1}$ and $R_{n+2}$ are connected to terminals with indices
$\{(t,1) \mid t \in T\}$ and $\{(t,2) \mid t \in T\}$ respectively. For n = 3, the
matrix A appears on the next page.

Observe that each row contains exactly 2 nonzero entries, thus
$\sum_j a_{ij} = 2$ for all i. It is easy to verify that

$$\omega_j = \sum_i a_{ij} \left( \frac{1}{2-1} \right) = 2\,|T_j| = 2^{j+1}, \quad j = 0,\ldots,n$$

$$\omega_{n+1} = \omega_{n+2} = \sum_i a_{i,n+1}\,(1) = \sum_{j=0}^{n} 2^j = 2^{n+1} - 1.$$
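
These weights are easy to confirm numerically. The sketch below builds the matrix A of the example for a given n and evaluates the weight with criticality ratio $k_i/(\sum_j a_{ij} - k_i)$ and $k_i = 1$; the function names and the ordering of rows are assumptions made for illustration.

```python
def bixby_example(n):
    """Matrix A of the example: rows are terminals (t, 1) and (t, 2) for t in
    T = T_0 u ... u T_n with |T_i| = 2**i; columns 0..n are R_0..R_n and
    columns n+1, n+2 are R_{n+1}, R_{n+2}."""
    rows = []
    for i in range(n + 1):
        for _ in range(2 ** i):          # each index t in T_i
            for side in (1, 2):          # terminals (t, 1) and (t, 2)
                row = [0] * (n + 3)
                row[i] = 1               # covered by R_i
                row[n + side] = 1        # and by R_{n+1} or R_{n+2}
                rows.append(row)
    return rows


def weights(A):
    """omega_j = sum_i a_ij * k_i / (sum_l a_il - k_i) with every k_i = 1."""
    return [sum(row[j] / (sum(row) - 1) for row in A) for j in range(len(A[0]))]


A = bixby_example(3)
print(len(A), "terminals,", len(A[0]), "repeaters")
print(weights(A))
# R_{n+1} and R_{n+2} together already cover every terminal:
print(all(row[-2] or row[-1] for row in A))
```

The largest weight belongs to $R_n$, although the two repeaters $R_{n+1}$ and $R_{n+2}$ alone form a cover.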


Selecting the variable (index) with maximal weight implies that
the choice will be repeater $R_n$. Eliminating $x_n$ and the corresponding
column as well as the rows corresponding to terminals covered
by $R_n$, we obtain a new problem of exactly the same type as the original
problem. The previous argument is independent of the value

region surrounding the Bayshore Freeway), an urban center (Palo
Alto and neighboring communities) on slightly sloping terrain
and finally a hilly region (with valleys, small plateaus, etc.).
Moreover, at this time, it appears that a reduced scale experiment
of a packet radio network will be installed in the Palo Alto area.
The purpose of this section is to give a description of the design
of this model.

Location: We decided to limit the investigation to the area
covered by the topographical map known as the Palo Alto Quadrangle,
California, 7.5 minute series (topographic), U.S. Department of
the Interior, Geological Survey, or equivalently to the area lying
between parallels 37° 22' 30" N. and 37° 30' N. and meridians
122° 15' W. and 122° 07' 30" W.; see Figure 6.

Terminals and Repeaters: The map was divided into 180 cells
(squares) obtained by dividing the meridian direction (height) into
15 equal parts and the longitudinal direction (width) into 12 equal
parts; see the map reproduced below. Each subregion
measures .9356 km in height and .9356 km in width, which yields a
surface area of .87533 km^2 per cell (or approximately .35 square miles).
Forty-two (42) locations were singled out as potential locations
for repeaters. In the hilly part of the map, the Southwest
region, the high points were selected, such as tops of hills, locations
of water towers, smaller but prominent points overlooking
valleys, etc. In the city, a number of high-rise constructions
were singled out as potential locations, such as radio towers,
high-rise apartment or office buildings, etc.

Connections Between Terminals and Repeaters: By definition,
a cell i was declared to be covered by repeater j if a terminal
located at the worst possible location in that cell i was in line
of sight (LOS) of the repeater j. (In some cases, it turned out
that a repeater located in a given cell k covers cell j but a
repeater located in cell j did not cover cell k.)


LOS Computation: To determine if a terminal at location j could be
seen from a repeater at location k, we proceeded as follows.
It was assumed that, if no particular high construction (building,
water tower, etc.) was available to install the repeater's antenna,
it would be installed 30 feet above ground level (making
use of a tree, telephone pole, etc.). The terminals were assumed
to be 5 feet above ground level. The points were said to be in
LOS if the first Fresnel zone associated with transmission between
these two points was free of any obstacle.


To compute the Fresnel zones, we assumed that transmission would
occur at 1500 MHz, corresponding to a wavelength $\lambda = .2$ m (7.87 in.).
We give here an example of such a Fresnel zone; transmitter and
repeater are assumed to be 5 km apart. In the figure on the next
page, we give the radius for certain cross-sections of the Fresnel
zones ($\lambda = .2$ m).
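The cross-section radii can be reproduced with the usual approximation for the first Fresnel zone, r = sqrt(λ d1 d2 / (d1 + d2)), where d1 and d2 are the distances from the cross-section to the two end points. This formula is the standard textbook one and is an assumption here, since the report does not state which expression was used; the function names are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def wavelength(freq_hz):
    """Free-space wavelength in metres for a carrier frequency in Hz."""
    return SPEED_OF_LIGHT / freq_hz


def first_fresnel_radius(d1, d2, lam):
    """Radius (m) of the first Fresnel zone at a cross-section that is
    d1 metres from one end of the path and d2 metres from the other."""
    return math.sqrt(lam * d1 * d2 / (d1 + d2))


lam = 0.2        # metres, the rounded value used in the text
path = 5000.0    # transmitter and repeater 5 km apart
for fraction in (0.1, 0.25, 0.5):
    d1 = fraction * path
    radius = first_fresnel_radius(d1, path - d1, lam)
    print(f"{d1:7.0f} m from one end: radius {radius:5.2f} m")
```

The zone is widest at mid-path, where the radius is about 15.8 m for these values.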
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Transmission  radius  is  supposed  to  be  less  than  20  km  (not 
an  upper  bound  here  since  the  greatest  distance  between  any  two 
points  in  this  region  is  less  than  18  km).  In  the  urban  area 
the  maximum  transmission  radius  was  assumed  to  be  7 km. 


Cover Multiple: In view of the fact that the region selected
was a subregion of the area to be covered eventually by the packet
radio network, we made some arbitrary decisions as to the boundary
cells. Since they will probably also be covered by repeaters
located outside the Palo Alto Quadrangle, we are requiring that these
boundary cells be 1-covered rather than 2-covered as the other cells.
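The cover requirement can be written as one integer per cell. The sketch below assumes that "boundary cells" means the outermost ring of the 15 by 12 grid, which the report does not spell out; `required_cover` and `is_feasible` are hypothetical helper names, and `covers` stands for the LOS adjacency data described above.

```python
ROWS, COLS = 15, 12  # 15 parts in height, 12 in width: 180 cells


def required_cover(rows=ROWS, cols=COLS, interior=2, boundary=1):
    """Map each cell (row, col) to the number of repeaters that must cover it.
    Assumption: a boundary cell is any cell on the outer ring of the grid."""
    requirement = {}
    for r in range(rows):
        for c in range(cols):
            on_edge = r in (0, rows - 1) or c in (0, cols - 1)
            requirement[(r, c)] = boundary if on_edge else interior
    return requirement


def is_feasible(requirement, covers, chosen):
    """covers[j] is the set of cells in LOS of candidate site j."""
    return all(
        sum(1 for j in chosen if cell in covers[j]) >= need
        for cell, need in requirement.items()
    )


req = required_cover()
print(len(req), sum(1 for need in req.values() if need == 1))
```

With this reading of "boundary", the grid has 50 cells that need one cover and 130 that need two.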

Computational Results: The PaAl problem described above was
solved by the heuristic algorithm, given the code name SETCOV, and
by OPHELIE. (A rapid analysis of the terminal-repeater adjacency
matrix shows that none of the optimal solutions would have been
generated if one had used the more simplistic approach of selecting
the repeater with highest adjacency degree. Such a selection yields
quite different answers requiring a larger number of repeaters.)

For the PaAl problem, the optimal solution requires the installation
of 14 repeaters (different runs with SETCOV showed that there
were in fact a number of optimal solutions with 14 repeaters). The
total running time for OPHELIE was approximately 12 CPU sec., excluding
set-up time. SETCOV required 3 sec. to produce a solution.

The relative success of the OPHELIE code must, at least in part, be
attributed to the fact that the linear programming solution (which
is used to initiate the branch-and-bound part of the code) is actually
the optimal solution. (Whether this is just an isolated phenomenon
associated with this particular problem or is actually a characteristic
of this whole class of problems is not known.) One optimal allocation
of repeaters consists in selecting sites 4, 5, 7, 11, 16, 17, 18,
27, 28, 30, 34, 36, 37, and 42.
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We  also  solved  a variant  of  the  model  described  above.  The 
presence of (small) ridges in cells 8, 23, and 54, combined with
our model design rule (a cell is covered by a repeater if the worst
location in that cell (subregion) can communicate with that repeater),
renders these three cells "critical" in the sense that there are
exactly one (for cell 8) and two (for cells 23 and 54) repeaters
covering these cells. This results in the automatic selection of certain
repeaters. To avoid this somewhat peculiar situation, we formulated
a variant of the PaAl problem requiring no lower bound on the number
of covers for cells 8, 23, and 54. This problem was also solved by
OPHELIE and SETCOV. Running times were of the same order as before.
The optimal solution requires only 12 repeaters this time. Both codes
produced the optimal solution, with OPHELIE again obtaining the optimal
solution in the LP part of the problem and SETCOV using weights
$w^{(3)}$. One optimal allocation of repeaters consists in selecting sites
4, 7, 11, 17, 18, 26, 28, 30, 31, 36, 37, 42. An OPHELIE solution is
depicted in Figure 7.

g. The Generalized k-covering Problem. It is not always plausible
to assume that the installation and maintenance costs associated
with various repeaters at different locations are the same. This
shortcoming of the previous model is overcome by associating different
costs to repeaters in the objective (of (I.H.P.)). An obvious
adaptation of the heuristic method described in Section 3 replaces
the weights $w^{(\ell)}$, $\ell = 1, \ldots, 4$, used before by their
cost-adjusted counterparts, where $c_j$
is the cost ($> 0$) associated with repeater $j$.
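One way to picture a cost-adjusted selection rule is to rank each candidate by the number of still-needed covers it supplies per unit cost. The sketch below is a generic greedy of that kind under that assumption; it is not the SETCOV weighting, whose exact form is not reproduced here, and all names are hypothetical.

```python
def greedy_cost_cover(covers, cost, need):
    """covers[j]: set of cells site j covers; cost[j] > 0;
    need[cell]: how many distinct repeaters must cover the cell."""
    residual = dict(need)          # covers still missing per cell
    available = sorted(covers)     # sorted so that ties break reproducibly
    chosen = []

    def gain(j):
        # number of cells that still need a cover and that site j supplies
        return sum(1 for cell in covers[j] if residual.get(cell, 0) > 0)

    while any(v > 0 for v in residual.values()):
        best = max(available, key=lambda j: gain(j) / cost[j], default=None)
        if best is None or gain(best) == 0:
            raise ValueError("remaining sites cannot meet the cover requirements")
        chosen.append(best)
        available.remove(best)
        for cell in covers[best]:
            if residual.get(cell, 0) > 0:
                residual[cell] -= 1
    return chosen


covers = {"A": {1, 2, 3}, "B": {1, 2}, "C": {3}}
need = {1: 1, 2: 1, 3: 1}
cheap_small_sites = greedy_cost_cover(covers, {"A": 10, "B": 1, "C": 1}, need)
unit_costs = greedy_cost_cover(covers, {"A": 1, "B": 1, "C": 1}, need)
print(cheap_small_sites, unit_costs)
```

In the toy data the expensive site A is skipped in favour of B and C, while with equal costs A alone is chosen.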
III. THE "FLAT TERRAIN" MODEL. A PROPOSED SOLUTION

a. Problem Formulation.

It is assumed that:

(i) The terrain is nearly flat.

(ii) The installation (and maintenance) expenses for repeaters
are independent of location.

(iii) Transmission characteristics are invariant with location.

A repeater communicates with a terminal if and only if they
are less than a fixed distance $d_t$ apart. The maximum distance
for communication between repeaters is assumed to be $d_r$. (In
practice $d_r$ is substantially larger than $d_t$ because repeater
antennas are higher than terminal antennas.) The area coverage
by L.O.S. radio can again be separated into two parts, a
covering problem and a connectedness problem (see Section 1,
Problem II). Let P be the 2-dimensional plane.

Covering Problem. Find a minimal covering of P by discs of
radius $d_t$ such that every point of P is covered at least k times.

Connectedness Problem. Let G = (N,E) be the graph obtained as
follows: The nodes N are the centers of the discs used in the
minimal covering of P. The edges E of G are obtained by connecting
two nodes if their distance is less than or equal to $d_r$. The graph G
is to be q-connected (reliability).

Since we are considering an infinite plane, one can no longer
define minimality of a cover in terms of its cardinality. There
are various procedures to define minimality, for example, the
cover with the smallest percentage of area wasted, or the cover for which
$\lim_{r \to \infty} \delta_C(r) / r^2$ is minimized over the space of all covers C of P that
satisfy the covering and connectedness constraints, where $\delta_C(r)$
is the number of discs of C whose interior intersects a circle of
radius r centered at the origin (an arbitrary but fixed point of P).

b. Solutions. A Conjecture. An optimal solution to the above
problem is known for $k = 1$, $q \le 6$ and $d_r / d_t \ge \sqrt{3}$ (that is, the
maximum distance for transmission between repeaters is at
least 73.2% larger than that between terminals and repeaters).
The problem is then the standard covering of the plane by
discs of fixed radius and with least overlap. In [12], it
is shown that the optimal solution is given by the arrangement
found below, which consists in placing a circle
of radius $d_t$ at each vertex of a regular triangular tessellation
whose grid points (vertices) are $\sqrt{3}\, d_t$ apart. Since
$d_r \ge \sqrt{3}\, d_t$, it follows that the resulting graph G contains as
subgraph the regular triangular tessellation whose grid points
(vertices) are $\sqrt{3}\, d_t$ apart, which is 6-connected. In Figure 8
the area of the discs covered twice is shaded.


FIGURE  8 

We do not consider any other cases for $k = 1$ since $q \le 6$
and $d_r \ge \sqrt{3}\, d_t$ will always be satisfied in practice.

For $k = 2$, the optimal solution is not known (whatever be
$q$ and $d_r / d_t$). However, due to the inherent symmetry of the
problem, we are ready to conjecture that the centers of the
optimal solution produce a regular grid of points in P.
Making this conjecture our working hypothesis, there are only
three cases to consider, the three regular tessellations: (i)
the tessellation of P by equilateral triangles, (ii) by squares,
(iii) by hexagons [2].


FIGURE 9 THE REGULAR TESSELLATIONS

Each vertex of the tessellation corresponds to a repeater and,
if every point in P is to be covered twice, then the distance between
two adjacent repeaters should not exceed $d_t$. Assuming that $d_t$ is
the distance between two adjacent vertices of the tessellations, then
in the triangular tessellation one finds 6 repeaters at distance $d_t$,
6 repeaters at distance $\sqrt{3}\, d_t$, 6 repeaters at distance $2 d_t$, ... from
any given repeater. In the square tessellation one finds 4 repeaters
at distance $d_t$, 4 repeaters at distance $\sqrt{2}\, d_t$, 4 repeaters at
distance $2 d_t$, ... from any given repeater. Finally, for the hexagonal
tessellation, there are 3 repeaters at distance $d_t$, 6 repeaters at
distance $\sqrt{3}\, d_t$, 3 repeaters at distance $2 d_t$, ... from any repeater.
With each tessellation one can associate a repeater density. It is
easy to see that there is 1 node:

per $\frac{\sqrt{3}}{2} d_t^2$ units of area in case (i),

per $d_t^2$ units of area in case (ii),

per $\frac{3\sqrt{3}}{4} d_t^2$ units of area in case (iii).

Assuming that $\frac{3\sqrt{3}}{4} d_t^2$ is 1 unit of area, it follows that the
repeater density is

1.5 for triangular tessellations,
1.299 for square tessellations,
and 1 for hexagonal tessellations.
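The neighbor counts and densities can be checked by enumerating the vertices of each tessellation. The sketch below takes $d_t = 1$ and builds each vertex set from two lattice vectors plus a basis; the coordinates chosen and the helper names `lattice` and `shells` are assumptions made for illustration.

```python
import math
from collections import Counter

SQRT3 = math.sqrt(3.0)


def lattice(kind, span=4):
    """Vertices of a regular tessellation with edge length 1 (d_t = 1),
    together with the area of the plane per vertex."""
    if kind == "triangular":
        a1, a2, basis = (1.0, 0.0), (0.5, SQRT3 / 2), [(0.0, 0.0)]
    elif kind == "square":
        a1, a2, basis = (1.0, 0.0), (0.0, 1.0), [(0.0, 0.0)]
    elif kind == "hexagonal":
        a1, a2, basis = (SQRT3, 0.0), (SQRT3 / 2, 1.5), [(0.0, 0.0), (0.0, 1.0)]
    else:
        raise ValueError(kind)
    points = [
        (bx + m * a1[0] + n * a2[0], by + m * a1[1] + n * a2[1])
        for m in range(-span, span + 1)
        for n in range(-span, span + 1)
        for bx, by in basis
    ]
    cell_area = abs(a1[0] * a2[1] - a1[1] * a2[0])
    return points, cell_area / len(basis)


def shells(points, max_dist=2.0):
    """Number of vertices at each distance from the vertex at the origin."""
    counts = Counter()
    for x, y in points:
        dist = math.hypot(x, y)
        if 1e-9 < dist <= max_dist + 1e-9:
            counts[round(dist, 3)] += 1
    return dict(sorted(counts.items()))


kinds = ("triangular", "square", "hexagonal")
unit_area = lattice("hexagonal")[1]
density = {kind: unit_area / lattice(kind)[1] for kind in kinds}
neighbors = {kind: shells(lattice(kind)[0]) for kind in kinds}
for kind in kinds:
    print(kind, round(density[kind], 3), neighbors[kind])
```

The square value comes out as $3\sqrt{3}/4 \approx 1.299$ in units of the hexagonal density.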

In other words, design (i) requires 50% more equipment than
(iii), and (ii) requires approximately 30% more equipment than (iii). Obviously,
if a regular grid yields the optimal solution then the one created
by the hexagonal tessellation (iii) is the optimal solution. If
$q \le 3$ and $d_t \le d_r$ then the corresponding graph G contains the
subgraph given by the tessellation (iii), which is obviously 3-connected.
If $d_r \ge \sqrt{3}\, d_t$ and $q \le 6$, again the connectedness constraint is
satisfied.

One should observe some inefficiencies in this covering. In
particular, some areas are covered by three separate repeaters
rather than 2 (as would ideally be the case). If we define the
thickness of a cover as being the average "thickness" of the layer
of discs covering the plane, then the optimal solution for k=1 has
thickness 1.209 whereas the thickness of the conjectured optimal
solution for k=2 is 2.418. Thus, in both cases, we have a 21%
"waste". In Figure 10, a solution is shown with the area covered
three times shaded. The question for k>2 is open.
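Both thickness values can be estimated by sampling one lattice cell and counting, at each sample point, how many discs of radius $d_t$ contain it. The sketch below takes $d_t = 1$; the lattice vectors, the sampling scheme and the name `coverage_stats` are illustrative assumptions. The closed forms it compares against, $2\pi/(3\sqrt{3})$ and $4\pi/(3\sqrt{3})$, are the disc area divided by the area per repeater.

```python
import math

SQRT3 = math.sqrt(3.0)
A1 = (SQRT3, 0.0)        # lattice vectors in units of d_t
A2 = (SQRT3 / 2.0, 1.5)


def centres(basis, span=4):
    """Disc centres: every basis point translated by m*A1 + n*A2."""
    return [
        (bx + m * A1[0] + n * A2[0], by + m * A1[1] + n * A2[1])
        for m in range(-span, span + 1)
        for n in range(-span, span + 1)
        for bx, by in basis
    ]


def coverage_stats(basis, radius=1.0, samples=60):
    """Sample one lattice cell on a regular grid and return the minimum
    and the mean number of discs covering a sample point."""
    points = centres(basis)
    r2 = (radius + 1e-9) ** 2
    counts = []
    for i in range(samples):
        for j in range(samples):
            s, t = (i + 0.5) / samples, (j + 0.5) / samples
            x = s * A1[0] + t * A2[0]
            y = s * A1[1] + t * A2[1]
            counts.append(
                sum(1 for px, py in points if (x - px) ** 2 + (y - py) ** 2 <= r2)
            )
    return min(counts), sum(counts) / len(counts)


# k = 1: triangular grid with vertices sqrt(3) * d_t apart
k1_min, k1_mean = coverage_stats([(0.0, 0.0)])
# k = 2: hexagonal grid with adjacent vertices d_t apart
k2_min, k2_mean = coverage_stats([(0.0, 0.0), (0.0, 1.0)])
print(k1_min, k1_mean, 2 * math.pi / (3 * SQRT3))
print(k2_min, k2_mean, 4 * math.pi / (3 * SQRT3))
```

The sampled means are estimates only and depend on the grid resolution; the minima confirm that every sampled point is covered at least once and at least twice, respectively.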

Remark. The "flat terrain" results yield obvious lower bounds
for  the  "hilly  terrain"  problem. 
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FIGURE  10 

SOLUTION TO TESSELLATION OF P BY HEXAGONS
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